AD-A124 194
UNCLASSIFIED

PROJECT ANALYSIS GUIDE (U)
OPERATIONAL TEST AND EVALUATION FORCE, NORFOLK VA
05 FEB 81
COMOPTEVFOR-INST-3960.8

F/G 5/2


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT  DOCUMENTATION  PAGE 

READ  INSTRUCTIONS 

BEFORE  COMPLETING  FORM 

1. REPORT NUMBER
2. GOVT ACCESSION NO.
3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)

Project Analysis Guide

5. TYPE OF REPORT & PERIOD COVERED

Handbook

6. PERFORMING ORG. REPORT NUMBER

COMOPTEVFORINST 3960.8

7. AUTHOR(S)

Commander Operational Test and Evaluation Force

8. CONTRACT OR GRANT NUMBER(S)

9.  PERFORMING  ORGANIZATION  NAME  AND  ADDRESS 

Department  of  the  Navy 

Commander  Operational  Test  and  Evaluation  Force 
Norfolk,  VA  23511 

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS

11. CONTROLLING OFFICE NAME AND ADDRESS

12. REPORT DATE

5 February 1981

13. NUMBER OF PAGES

75

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)

Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)

This document has been approved for public release and sale; its
distribution is unlimited.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

Unlimited Distribution

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Test and Evaluation, Analysis, Experimental Design, Operational Test and
Evaluation, Project Operations, Test Planning, Operational Suitability

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

This document is designed primarily for the Operational Test Director (OTD)
and provides guidance in test planning and experimental design for
operational test and evaluation.

DD FORM 1473 EDITION OF 1 NOV 65 IS OBSOLETE




DEPARTMENT  OF  THE  NAVY 

COMMANDER  OPERATIONAL  TEST  AND  EVALUATION  FORCE 
NORFOLK, VIRGINIA 23511


COMOPTEVFOR INST  3960.8 

02B:mas 

03  February  1981 

COMOPTEVFOR  INSTRUCTION  3960.8 
Subj: Project Analysis Guide

1. Purpose. This document provides guidance for various facets of
analysis in OT&E (operational test and evaluation). It is designed
primarily for OTDs/OTCs (Operational Test Directors/Coordinators)
of COMOPTEVFOR Staff. Subordinate commands may supplement it as
necessary according to their needs.

2. Future Changes

a. OT&E is a dynamic, evolving process; suggested changes to
this Guide are encouraged. Address them to Code 02.

b. Project Analysts will be asked to comment on the Guide's
contents annually, in an effort to ensure continuing Guide utility.


H.  A.  FRENCH 
Deputy  Chief  of  Staff 


Distribution: 

(COMOPTEVFORINST 5216.2B)

List  II,  1,  5 

Copy  to: 

DEPCOMOPTEVFORPAC  (50) 

CO,  VX-1/VX-4/VX-5  (50) 
COMOPTEVFORDET  Sunnyvale  CA  (10) 
COMOPTEVFORDET  Pax  River  MD  (4) 
VX-5  DET  Whidbey  Island  WA  (6) 


Contents

References

Section 1 - Introduction to Analysis
101 - Analysis and Synthesis
102 - Planning and Data Analysis
103 - Scope of the Evaluation
104 - Realism in the Evaluation

Section 2 - Analysis Before Project Operations
201 - OT&E Planning
202 - Preparation for Planning
203 - The Engagement Model
204 - Uses of the Engagement Model
205 - Analysis in the TEMP
206 - Analytical Testing Issues
207 - Analysis in Test Planning
208 - Function/Variable Chart
209 - Supplementing Sample Size
210 - Side-by-Side
211 - Factorial
212 - Paper Rehearsal
213 - Pretesting
214 - Bias
215 - Steps in Test Planning
216 - Methodology Steps for Quantitative Elements
217 - Determining Suitability Objectives
218 - Methodology Steps for Qualitative Elements
219 - Determining Project Operations

Section 3 - Analysis During Project Operations
301 - Introduction
302 - On Scene Preparations
303 - Analysis During Project Operations

Section 4 - Analysis After Project Operations
401 - Introduction
402 - Steps

Section 5 - Suitability Considerations
501 - Scenario Approach
502 - Operational Suitability
503 - Data Collection and Processing
504 - Reliability
505 - Maintainability
506 - Availability
507 - Logistics Supportability
508 - Compatibility
509 - Interoperability

Section 6 - Support
601 - Guidelines
602 - Missions/Threats/Scenarios/Criteria
603 - Instrumentation/Data Processing
604 - Simulation/Gaming/Modeling
605 - Project Analysis
606 - Types of Support Augmentation

Section 7 - Summary
701 - Purpose of OT&E
702 - Realism
703 - Test Data
704 - Analysis in OT&E
705 - MOE Approach
706 - Operational Effectiveness and Operational Suitability
707 - System-Level Testing
708 - Mission Orientation in Testing
709 - Responsibilities in OT&E
710 - The Stress in OT&E

Section 8 - Glossary of Special Analytical Terms


References 


(a) Analyst's Notebook, COMOPTEVFOR INSTRUCTION 3960.7. This is
a handbook of data analysis procedures to supplement standard
analysis texts. The coverage is highly selected, non-mathematical,
and straightforward. While the aim is to refresh the newly arrived
analyst, the OTD may find certain portions of interest.

(b) M. G. Natrella, Experimental Statistics, National Bureau of
Standards Handbook 91, U.S. Government Printing Office, Washington,
DC, 1966. If the OTD is stuck without analysis help, this is
recommended. The Table of Contents is descriptive as to the
problems covered - comparing averages of two materials, etc. The
analysis techniques are given in step fashion - in general terms
and correspondingly with an illustration. This is an excellent
cookbook.

(c) S. Siegel, Nonparametric Statistics for the Behavioral
Sciences, McGraw-Hill, New York, 1956. This is an excellent
cookbook for analysis of count type data (hits/misses, yes/no type).
Each technique is illustrated with a worked-out example. The
coverage of count type data analysis techniques is extensive.




Section  1 

Introduction  To  Analysis 

101. Analysis and Synthesis. In the broad sense, analysis
is dividing the whole into smaller and smaller pieces until
they become manageable for direct treatment. For example, in
project planning the general objectives may lead to a subobjective
to determine probability of kill, which in turn may include
probability of detection as a sub-subobjective. An important phase
of analysis is synthesis, which works in the opposite direction:
combining pieces into larger and larger pieces. For example, the
probability of detection that has been determined may be combined
with results on classification, acquisition, etc., to derive a
probability of kill that may in turn be combined with other
measures to address a general project objective.

102. Planning and Data Analysis. Each analysis step in planning,
with its finer and finer subdivisions, has a counterpart in synthesis
after project operations that leads to reporting of larger and
larger combinations. (See Figure 1-1.) Thus, analysis in planning
is directly related to data analysis in reporting. The words may
not be the same, but the only real difference is the possible
influence of surprises during project operations. This backs up
the principle: The sooner the OTD visualizes the complete form
and structure of the final report, the better the planning.

The analysis and corresponding synthesis apply to project work
at all stages. The measurement spectrum is an example. After
testing, analysis is characterized, broadly speaking, by a
continuous summary process. Thousands of measurements taken during
testing are processed into data. Data are analyzed into performance
measures or MOEs (measures of effectiveness). These are combined
into a CEM (combat effectiveness measure), one of the end-products
of analysis and an input into the evaluation process. The same
spectrum is pertinent to project planning by going in the opposite
direction. An important initial point is the CEM; then the
performance measures (MOEs) are specified as components of the CEM.
The next step is to determine the type of data, which in turn
leads to the instrumentation needed to obtain such data from the
planned tests. See Figure 1-2.


                     Planning                             Reporting

General Objectives   "Evaluate the effectiveness of..."   "The effectiveness is..."
Sub-Objective        "Determine Prob. of kill..."         "The Prob. of kill was..."
Sub²-Objective       "Determine Prob. of detection..."    "The Prob. of detection was..."
Sub³-Objectives      "Determine range at detection..."    "The average range was..."

                               During Testing


Figure 1-1

Relationship Between Planning and Reporting

a. Measurements. In project operations, seldom can a performance
quantity (e.g., detection range) be measured directly.
To determine the detection range of a submarine sonar, for example,
the positions of the sonar ship and the target are accurately
tracked and recorded. This mass of measurements is correlated,
and the range between the ship and target is found at the specific
event, detection. This is done for each trial or run. The
procedure of processing measurements into data (data processing)
is usually the responsibility of a technical facility such as a
test range. Our responsibility is to determine what we need, how
much, how good, in what form, and when. Data processing is
completed by preparation of a run-by-run summary, giving the test
conditions, pertinent performance data, validity codes, and remarks.
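
The correlation step described in paragraph a can be sketched in code. This is a minimal illustration only, not a test range's actual procedure: it assumes each track is a time-sorted list of (time, x, y) fixes in a common flat coordinate frame, and that positions between fixes may be linearly interpolated. The function names and the sample tracks are hypothetical.

```python
from bisect import bisect_left
import math


def position_at(track, t):
    """Linearly interpolate an (x, y) position from a time-sorted
    track of (time, x, y) fixes. Assumes a flat coordinate frame."""
    times = [fix[0] for fix in track]
    if not times[0] <= t <= times[-1]:
        raise ValueError("event time lies outside the tracked interval")
    i = bisect_left(times, t)
    if times[i] == t:
        return track[i][1], track[i][2]
    (t0, x0, y0), (t1, x1, y1) = track[i - 1], track[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


def range_at_event(ship_track, target_track, event_time):
    """Range between ship and target at the time of a specific event
    (for example, detection)."""
    sx, sy = position_at(ship_track, event_time)
    tx, ty = position_at(target_track, event_time)
    return math.hypot(tx - sx, ty - sy)


# Hypothetical tracks for one run: (time in minutes, x in yd, y in yd).
ship_track = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
target_track = [(0.0, 3000.0, 4000.0), (10.0, 6000.0, 8000.0)]
detection_time = 5.0  # time of the detection event, from the run log

detection_range_yd = range_at_event(ship_track, target_track, detection_time)
```

Repeating this for each run yields one detection range per run, which is the kind of entry that goes into the run-by-run summary.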

b. Data

(1) Data are the result of processed measurements that
include definition of a specific event. Thus, the distance between
the sensor and the target at a specific event is taken as range
of detection.


[Figure: column headings, left to right: Measure of Mission Success,
Combat Effectiveness, Measure of Effectiveness, Data, Measurement;
the body of the diagram is not recoverable.]

Illustration of Dendritic Structure

(2) Data are treated by data analysis; we analyze the number
of hits, times to acquire, ranges at detection, etc. Each function
such as detection or acquisition is analyzed separately. Basically,
this analysis is to determine the effects of the different test
conditions. This differentiation sets limits to the summarization
process: should the data be summarized into one average, or
should separate averages be reported for different conditions? The
analysis and subsequent summaries provide information that is
reported as Results, for example: "The average range of detection
for slow targets was 90 nmi; for fast targets it was 70 nmi."

(3) Initially, each type of data is analyzed separately.
Then some types are combined. For example, a cumulative detection
summary may include analysis of detection range and also no-contacts.
Larger summaries or MOEs are made by combining summary measures
into other MOEs. These summary measures differ in degree. Basically,
they are characterized by combining more and more results into MOEs.

(4) Data can be continuous (e.g., the miss distance was
6.2 ft.), or can be categorized (e.g., hit, no-contact, failure).
Continuous type data are usually expensive to obtain, involving
a high order of instrumentation and data processing; to categorize
a run outcome as hit or miss, success or failure, is relatively
cheap. However, more information is available in continuous
type data. Knowing that detection occurred at 34,000 yds tells
us more than knowing that detection has occurred. The added
information per run may result in the need for fewer runs. Thus,
there is a tradeoff. We usually fund the instrumentation, etc.,
needed to get continuous type data; the payoff comes with reduced
fleet service requirements.

(5) Qualitative data are becoming more and more critical in
OT&E. Qualitative data are frequently obtained through
highly structured questionnaires filled in by well-qualified
observers, operators, etc. These data may be supplemented by
interviews and debriefings. With qualitative type data we try
to get a consensus, so that the basis for results is as broad as
possible.

(6) The accuracy needed in data has an important influence
on the cost of instrumentation and data processing. Usually the
accuracy needed in OT&E is gross compared to technical testing.
However, in a project with a small sample size there may be
increased need for instrumentation and data processing. For
example, not all phenomena are probabilistic; not all events are
statistical. A technical examination of an event such as
no-detection or a hardware failure may indicate a technical cause or
design deficiency. The review may indicate a deterministic
situation, and firm results may be formed even with small sample
sizes. So, as usual, there is a tradeoff as to need, cost, etc.
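
Paragraph (2) above asks whether data should be summarized into one average or into separate averages for different conditions. The sketch below shows one way that comparison might be computed from a run-by-run summary. It is illustrative only: the record layout, the field names, and every number in `runs` are invented for the example and are not test results.

```python
from collections import defaultdict
from statistics import mean


def average_by_condition(runs, condition, measure):
    """Average `measure` over valid runs, separately for each
    value of the test condition named by `condition`."""
    groups = defaultdict(list)
    for run in runs:
        if run["valid"]:
            groups[run[condition]].append(run[measure])
    return {level: mean(values) for level, values in groups.items()}


# Hypothetical run-by-run summary (invented values).
runs = [
    {"run": 1, "target_speed": "slow", "detection_range_nmi": 95.0, "valid": True},
    {"run": 2, "target_speed": "slow", "detection_range_nmi": 85.0, "valid": True},
    {"run": 3, "target_speed": "slow", "detection_range_nmi": 90.0, "valid": True},
    {"run": 4, "target_speed": "fast", "detection_range_nmi": 65.0, "valid": True},
    {"run": 5, "target_speed": "fast", "detection_range_nmi": 75.0, "valid": True},
    {"run": 6, "target_speed": "fast", "detection_range_nmi": 70.0, "valid": True},
    # Invalid runs (per the validity code) are excluded from every summary.
    {"run": 7, "target_speed": "fast", "detection_range_nmi": 120.0, "valid": False},
]

by_speed = average_by_condition(runs, "target_speed", "detection_range_nmi")
pooled = mean(r["detection_range_nmi"] for r in runs if r["valid"])
```

With these invented values the separate averages differ markedly while the pooled average hides the difference, which is the situation in which separate reporting by condition would be preferred.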

c. Measures of Effectiveness

(1) The classical definition of MOE is that it is a
numerical measure of how well a task is done or an objective is met.
For a sensor, the MOE may be the probability of detection. For
a missile system, the MOEs may include the number of antiship
missiles that can be engaged successfully. For a surface ship,
an MOE may be the likelihood of an escorted ship not coming under
torpedo attack during an ocean crossing.

(2) The classical definition and use of MOE covers too broad
a spectrum and may cause confusion. Hereafter, the term MOE will
not be used per se, but will be modified. In a more narrow sense,
we will use performance MOE or functional MOE. Later, for broad
use, we will define CEM (combat effectiveness measure) and MOS
(measure of operational suitability). Functional MOEs are
summary measures based on data analysis (e.g., probability of
detection) to describe a function (e.g., detection). Thus, the
percentage of valid opportunities detected is presented as a
measure of detection capability. It is standard practice to
obtain these functional MOEs, particularly in comparisons of
equipment. In some projects these functional MOEs may be sufficient
as the end-product of analysis. In many projects, particularly
those involving systems, functional MOEs are not sufficient. They
must be further combined as will be described later.

(3) A functional MOE depends on "other things being equal."
If so, then a mine clearing equipment "A" with a sweep width
(MOE) of 50 yd is better than equipment "B" whose sweep width (MOE)
is 30 yd. Even so, this functional MOE comparison does not tell
us what this improvement in sweep width means operationally.
Suppose equipment "A" requires a longer turnaround time;
then we would need a more complete measure of performance that
included time. We would perhaps determine the time each equipment
took to clear a typical mine field. As you can see, we are now
combining different functional measures: sweep width, time to
sweep, clearance rate. To combine more and more functional MOEs,
we have to "model" in an operational sense. We have to decide
what is a typical mine field or fields; what is an acceptable
standard level; what are the tactics to be used in mine sweeping;
etc. The process of modeling becomes more and more complicated;
the assumptions become more important; the calculations become more
involved. However, we are obtaining more complete measures; these
measures are coming closer to a measure of operational
effectiveness: a measure of the system's capability to do its job.
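
The mine clearing example can be made concrete with a very small model. The sketch below is a deliberately crude illustration of combining sweep width with turnaround time, not a minesweeping model from this Guide. It assumes a rectangular field swept in parallel passes with no overlap, a fixed sweep speed, and one turnaround between successive passes. Only the 50 yd and 30 yd sweep widths come from the text; the field size, speed, and turnaround times are invented.

```python
import math


def clearance_time_min(field_length_yd, field_width_yd, sweep_width_yd,
                       speed_yd_per_min, turnaround_min):
    """Time to cover a rectangular field with parallel passes.

    Assumptions (illustrative only): passes do not overlap, every
    pass runs the full length of the field, and one turnaround is
    needed between successive passes.
    """
    passes = math.ceil(field_width_yd / sweep_width_yd)
    time_sweeping = passes * field_length_yd / speed_yd_per_min
    time_turning = (passes - 1) * turnaround_min
    return time_sweeping + time_turning


# Hypothetical "typical" field and tactics (invented values).
FIELD_LENGTH_YD = 3000
FIELD_WIDTH_YD = 600
SPEED_YD_PER_MIN = 300

# Equipment "A": wider sweep, longer turnaround (turnaround invented).
time_a = clearance_time_min(FIELD_LENGTH_YD, FIELD_WIDTH_YD, 50,
                            SPEED_YD_PER_MIN, turnaround_min=20)
# Equipment "B": narrower sweep, shorter turnaround (turnaround invented).
time_b = clearance_time_min(FIELD_LENGTH_YD, FIELD_WIDTH_YD, 30,
                            SPEED_YD_PER_MIN, turnaround_min=5)
```

Under these invented assumptions equipment "B" clears the field sooner despite its narrower sweep width, which shows why the choice of typical field and tactics matters as much as the functional MOE itself.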
d. Combat Effectiveness Measure

(1) In OT&E, the classical interpretation of MOE is too
narrow to be useful. In OT&E, the critical numerical value is
how well a Naval unit will perform its operational mission
pertinent to the system under evaluation in directly supporting the
combat weapon system. This support concept is critical in defining
operational MOEs. To stress this we use CEM, rather than the
general or technical or non-operational MOEs.

(2) For example, an evaluation of a submarine sonar may
indicate excellent detection performance. However, when it is
interfaced with the submarine's fire control system, oscillations
may be so prevalent that the firing solution is seriously degraded.
The oscillations are not caused by the sonar per se (which is being
evaluated) but by the interface (which is not being evaluated).
For scenarios involving torpedo firing, the sonar has poor
operational effectiveness even though it has excellent technical
effectiveness. Its MOE is high but its CEM is low. The point is
that the submarine commander would prefer the old sonar rather than
the technically improved version as things stand. Thus, in evaluating
a submarine sonar, the selected CEM (e.g., for a barrier-type
mission, the likelihood of a penetrator being torpedo-engaged
before the platform is engaged) may be low even though the fire
control system is the culprit.

(3) Another example: a decoy device was very effective in
decoying a torpedo when the decoy was properly launched from a
submarine. However, the launcher on a particular class of
submarine was found to be inadequate for launching this and all other
devices. The decoy was not operationally effective for this
submarine class until the launcher problem was corrected.

(4) CEMs are seldom measured or determined directly.
Usually they are calculated, typically by combining functional
MOEs. Thus, a mission/scenario may be decomposed into various
functions. Functional MOEs relate to how well these functions are
performed. For example, the CEM for an air-to-air missile may be
an exchange ratio: the probability of the target being destroyed
before the missile aircraft is destroyed. The CEM could be
determined by combining (e.g., by conditional probabilities) success
measures of detection, positioning, launch, guidance, fuzing, and
kill.

(5) There are many types of CEM; the most complete
measures of operational effectiveness pertain to exchange ratios.
These ratios take platform survivability and attack effectiveness
into account. Usually this measure is used for whole-ship or
aircraft evaluations. In evaluations involved with weapon systems,
platform vulnerability may not specifically be taken into account.
The most common measures used then are restricted to some form of
kill probability. For a torpedo weapon system the "weapon system
effectiveness" is the probability of kill of a single submarine
or surface ship (one-on-one situation). For AAW the "saturation
rate" pertains to the number of targets killed in a mass attack.
In ASMD, a variation of probability of kill is used, pertaining
to surviving.

(6) Here are some actual CEM results (camouflaged slightly
for security reasons):

(a) In a self-protection situation under a wave
attack at low altitude, a TERRIER, double-ender, dual-launched ship
at Condition I has a saturation rate of eight targets. At high
altitude it is 18 targets.

(b) The estimated probability of seaworthy impairment
was about 0.25 for deep targets. For shallow targets the
estimated probability of seaworthy impairment was lower and varied
strongly by situation.

In the latter example, the "probability of kill" values were
derived from "conditional values" using components of detection
(P_d), location (P_l), aimpoint (P_a), missile delivery (P_m), and
lethal radius (P_r). The latter component was obtained from a
published reference.

$$ \mathrm{CEM} = P_d \cdot P_l \cdot P_a \cdot P_m \cdot P_r $$

Note that CEM pertains to performance assuming that the system
is up in a materiel suitability sense.
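
The product of conditional values can be sketched as a short calculation. This is an illustration under stated assumptions, not the computation used for the results quoted above: each value is taken to be the probability of success in one function given success in all preceding functions, and every numerical value below is invented. The function name is hypothetical.

```python
import math


def combat_effectiveness_measure(conditional_probabilities):
    """CEM as the product of conditional success probabilities.

    `conditional_probabilities` maps each function name to the
    probability of success in that function, given success in all
    earlier functions.
    """
    for name, p in conditional_probabilities.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name}: probability {p} is outside [0, 1]")
    return math.prod(conditional_probabilities.values())


# Invented values for the five components named in the text.
components = {
    "detection": 0.80,         # P_d
    "location": 0.90,          # P_l
    "aimpoint": 0.85,          # P_a
    "missile delivery": 0.95,  # P_m
    "lethal radius": 0.70,     # P_r (in the text, taken from a reference)
}

cem = combat_effectiveness_measure(components)
```

Because the measure is a product, it can never exceed its weakest component, so a single poor function (for example, an inadequate interface or launcher) drives the CEM down regardless of how well the other functions perform.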



e. Measure of Operational Suitability

(1) The system must be available for use when needed and
continue to perform in the materiel reliability sense throughout
its mission. MOS is usually related directly to operational
availability. For a continuous-use type system the MOS may
be simply the percentage of time that the operational commander
can use the system compared to the total time needed during the
mission. For expendable items such as missiles, the materiel
reliability success ratio includes the likelihood of having a
"good missile at launch" as well as the materiel success during
attack. If repairs cannot be made during the mission, such as
during an aircraft sortie, then the MOS treatment may be akin
to that for expendable items.

(2) The analysis structure for performance applies also for
reliability. Measurements include a breakdown of total test time
into various categories; data include times to failure, times to
repair; MOS includes mean time to failure, mean time to repair.
These are finally combined into an overall MOS. As with MOEs,
MOSs may have to be determined separately for various scenarios/
missions, etc.

(3) Usually a system includes a series of sequential
functions or modes; the MOS may take the form of a product of
conditional reliabilities.

$$ \mathrm{MOS} = R_1 \cdot R_2 \cdot R_3 \cdots $$

On the other hand, the system may include redundancy, turnaround,
etc., and reliability modeling may be quite extensive.
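
The quantities named in paragraphs (2) and (3) can be sketched as follows. This is a minimal illustration, not the Guide's prescribed procedure. In particular, combining mean time to failure and mean time to repair as MTTF / (MTTF + MTTR) is a common textbook form assumed here for a continuous-use system; the Guide says only that the measures are "finally combined into an overall MOS." All data values and function names are hypothetical.

```python
import math
from statistics import mean


def mean_times(times_to_failure_hr, times_to_repair_hr):
    """Return (mean time to failure, mean time to repair) in hours."""
    return mean(times_to_failure_hr), mean(times_to_repair_hr)


def availability(mttf_hr, mttr_hr):
    """Fraction of time the system is usable, assuming it alternates
    between operating periods and repair periods (assumed form)."""
    return mttf_hr / (mttf_hr + mttr_hr)


def series_reliability(conditional_reliabilities):
    """MOS for sequential functions or modes: R_1 * R_2 * R_3 * ..."""
    return math.prod(conditional_reliabilities)


# Invented data for a continuous-use system.
mttf, mttr = mean_times([100.0, 140.0, 120.0], [4.0, 8.0, 6.0])
mos_continuous = availability(mttf, mttr)

# Invented conditional reliabilities for three sequential modes.
mos_sequential = series_reliability([0.98, 0.95, 0.99])
```

Systems with redundancy or turnaround would need a more extensive reliability model than either of these two simple forms.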




f. Measure of Mission Success. CEM combined with availability
and reliability (MOS) is called MOMS (measure of mission success).
This combined measure is a more complete and realistic
figure of what could be expected in an operational sense.

$$ \mathrm{MOMS} = \mathrm{CEM} \cdot \mathrm{MOS} $$

If the product of conditional values applies we have

$$ \mathrm{MOMS} = P_1 R_1 \cdot P_2 R_2 \cdot P_3 R_3 \cdots $$

Usually, however, a high order of modeling effort is involved.
Even so, important elements will not be taken into account and
have to be mentioned in "Limitations to Scope." Examples are
materiel reliability of the threat, vulnerability of the platform,
etc.
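
The two forms of MOMS given in paragraph f can be checked against each other numerically. The sketch below assumes the simple product-of-conditional-values case only; the function name and the stage values are hypothetical and invented for illustration.

```python
import math


def measure_of_mission_success(stages):
    """MOMS = P_1 R_1 * P_2 R_2 * P_3 R_3 * ...

    `stages` is a list of (P_i, R_i) pairs: the conditional
    performance probability and the conditional materiel
    reliability for each sequential function.
    """
    return math.prod(p * r for p, r in stages)


# Invented values for three sequential functions.
stages = [(0.90, 0.98), (0.85, 0.95), (0.80, 0.99)]

moms = measure_of_mission_success(stages)

# The same result obtained as CEM * MOS.
cem = math.prod(p for p, _ in stages)
mos = math.prod(r for _, r in stages)
moms_from_parts = cem * mos
```

The two results agree to within floating-point rounding because multiplication is commutative; the stage-by-stage form is useful mainly when performance and reliability data are collected function by function.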

g.  Elements  of  Essential  Analysis 

(1)  The  dendritic  or  branching  structure  of  decomposing 
an  objective  into  subobjectives  is  a  standard  military  planning 
technique.  For  example,  the  Army  and  Mitre  Corp  have  implemented 
this  into  an  EEA  (element  of  essential  analysis).  An  objective  is 
structured  into  smaller  and  smaller  pieces  until  manageable  and 
specific.  Then  a  subobjective  is  formed,  called  EEA.  This  is 
followed  by  CQs  (checklist  questions)  that  can  be  answered  by 
data,  yes/no,  or  a  word. 

(2) For example, Mitre used the EEA approach in AEGIS
planning. An operational question or "critical issue" in
effectiveness was divided into three branches. One of these
branches was divided into seven limbs, one of which was: "What
was the contribution of AEGIS to overall task force early warning?"
This was divided into four EEAs, e.g., "What percentage of first
detections was provided by AEGIS?" The corresponding CQ was
"How many initial detections were made by each surveillance system?"

h. Methodology. The EEA approach and the MOMS approach are
basically the same. However, use of each technique in its place
can simplify planning. The methodology outlined in paragraph 215
combines both approaches.

(1) The CEM and MOS approach is used for quantitative
elements.

(2) The "critical issue" approach and the EEA technique
are used for qualitative elements.
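
The critical issue, branch, limb, EEA, and CQ decomposition described in paragraph g can be represented as a simple tree. The sketch below is one possible representation, not Mitre's or the Army's implementation. The class and function names are hypothetical, and the critical issue and branch wording are placeholders because the text does not quote them; only the limb, EEA, and CQ wording comes from the AEGIS example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    level: str  # "critical issue", "branch", "limb", "EEA", or "CQ"
    text: str
    children: List["Element"] = field(default_factory=list)


def checklist_questions(element, path=()):
    """Yield (path, question) for every CQ in the tree, where `path`
    is the tuple of parent texts from the root down to the EEA."""
    if element.level == "CQ":
        yield path, element.text
    for child in element.children:
        yield from checklist_questions(child, path + (element.text,))


tree = Element("critical issue", "(wording not given in the text)", [
    Element("branch", "(wording not given in the text)", [
        Element("limb",
                "What was the contribution of AEGIS to overall task "
                "force early warning?", [
            Element("EEA",
                    "What percentage of first detections was provided "
                    "by AEGIS?", [
                Element("CQ",
                        "How many initial detections were made by each "
                        "surveillance system?"),
            ]),
        ]),
    ]),
])

questions = list(checklist_questions(tree))
```

Walking the tree downward corresponds to planning; reading each path back upward from a CQ to the critical issue corresponds to the synthesis done in reporting.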

i. Implementation. Although the dendritic analysis structure
is simple and straightforward, its application to a specific
project is usually complex, time-consuming, and more of an art than
a science. For example, defining the functional MOE or CEM is
only a part of the picture, and is often relatively insignificant
in effort compared with implementation of the definitions.
The complexity stems from the fact that the effort is not to
summarize what happened during project operations per se. The
aim is to predict the future outcome in a cold/hot war situation,
when and where it will occur, against likely threats, etc.
Obviously we can only try to take as many elements into account
as feasible.

(1) In calculating CEM and MOS, the missions of the
system and likely scenarios are paramount. For example, in ASW,
will the system be used in a barrier or patrol, or both? Will
we have prior intelligence? What type of targets? Will we
have an alerted target? Will we be operating alone or in company?
These and many other questions have to be answered.

(2) The process of forming and using the scenarios,
bringing the environment into the picture, determined target
evasion, etc., involves a high order of skills and experience.
Extensive modeling must be done. Usually a computer is involved.
See paragraphs 203 and 604. In some fields involved with weapons
systems, modeling has already been accomplished and models are
available that are readily adaptable to our needs. In areas such
as command and control and communications, we have to be content
with a limited scope: functional MOEs.

(3) The emphasis in this analysis section is on quantitative
measures. This does not imply that qualitative aspects are not
important. Except for the obvious, there is not much that can
be said about the qualitative aspects. See paragraph 214.c.

j. Terminology. The above terminology (MOE, CEM, MOS) is not
standardized in the evaluation community. Others usually use
only functional MOEs to cover the entire spectrum of measures.
For example, a weapon system evaluation may have 30 different
functional MOEs ranging from average detection range to probability
of kill. These functional MOEs may or may not include reliability.
Obviously it is important to know what is included in and what is
excluded from each measure.

103.  Scope  of  the  Evaluation 

a. The Navy's need for the system under evaluation includes a
variety of aiming conditions, a variety of targets, a variety of
installations, etc. Let's explore this further. If the system
under evaluation is introduced into the fleet, it will be installed,
not only on our test ship, but on the many ships of its class.
It will be used, not by our test crew on a four-runs-a-day basis,
but by a variety of crews under different levels of stress; not
in a test range, but in the North Atlantic and the Med. The targets
will not be a Guppy class, but a variety of classes with a variety
of characteristics doing a variety of missions, maneuvers, and
evasive tactics. We can build up an impressive list of varieties.

b. The important point is that our evaluation pertains to
this variety. CNO is not interested per se in what happened
last June in the warm waters of Key West with a special
installation and an alerted crew. Our only interest in the happening last
June is obtaining information on what will happen under the scope
and variety of possible situations. Therefore, in our evaluation,
it's better to cover as broad a scope as possible, rather than
look in detail at a small portion. Our job is to sketch out the
forest, not to describe a few trees in detail. How wide a scope
should we strive for? Analytically speaking, the answer depends on
the analysis. If we can't analyze the data, our results may
be meaningless or misleading. There is a tradeoff that we will
explore later.

104. Realism in the Evaluation. In addition to having a broad
scope, our testing must strive for realism. Realism pertains to
the expected use of the system under test. It includes use of
sailors (rather than civilian technicians), personnel trained as
they would be trained, number of personnel, rank of personnel,
etc. The most important element in this striving for realism is
the use of realistic scenarios, including threats, tactics, etc.
Our tests are as realistic as possible within the numerous and
obvious limitations in trying to run a miniwar. Usually our
testing is quasi-realistic: realistic enough so that the results
can be projected to the future, but controlled enough to force
encounters and to enable reconstruction for analysis purposes.
This tradeoff will be explored later.


Section  2 

Analysis  Before  Project  Operations 


201.  OT&E  Planning.  With  respect  to  OPTEVFOR,  the  entire  OT&E 
planning  effort  is  a  continuous  analysis  effort.  The  beginning 
is  a  broad  project  purpose,  then  a  gross  division  into  areas, 
a  more  detailed  division  in  a  master  plan,  still  more  details 
in  a  test  plan,  finer  detail  in  a  series  of  tests,  and  finally, 
within  each  test,  a  vernier  as  to  data  and  measurements.  Since 
steps  are  interrelated  and  vary  in  degree  from  project  to  project, 
the  analysis  process  is  best  handled  as  a  continuous  process,  and 
not  divided  into  master  plan,  test  plan,  etc. 

a.  Purpose .  The  project  purpose  is  usually  stated  in  broad 
terms  like  "...  to  determine  the  operational  effectiveness  and 
operational  suitability  of  the  XXX  System."  While  not  stated 
directly,  the  following  points  are  implied  because  they  are 
critical  in  delineating  the  purpose:  missions,  scenarios,  threats, 
fleet  criteria,  and  system  functions. 

b.  Mission.  Mission  refers  to  the  military  objectives 
related  to  the  tasks  to  be  done.  Missions  and  tasks  refer  to  the 
battle element that the system being evaluated supports. For a
radar antenna, the battle element may be the radar system. For
a gun system the battle element may be the ship on which it is
installed. For a "whole ship" evaluation, the battle element may
be the Task Group itself. Usually a system has different missions
and tasks. Some tasks are independent of each other, others may
not be. It may be possible to attempt to weight each mission and
combine results analytically. However, this is not recommended.
Each mission is better treated separately in planning and in
reporting.

c.  Scenarios. A scenario is in effect a sequential description
of events occurring during an engagement. It indicates the
action during an encounter in the order of its development. The
scenario details and directs our interest. It translates the
missions and tasks to a workable level. It is fruitful in determining
functions, issues, test conditions, etc. The scenarios
are key elements in determining project operations.

d.  Threat. An important aspect of scenarios is the opposition
or threat -- including the type, quantity, capability, and expected
reactions opposing our carrying out the assigned tasks.
Basically, this is a scenario from the opposition point of view.
Be careful not to over- or underestimate the threat and to reference
the proper time frame -- when the system will be introduced
into the fleet.

e.  Fleet  Criteria.  As  we  consider  the  various  tasks,  we  need 
to  establish  what  we  consider  to  be  performance  in  those  tasks. 

Such  criteria  should  be  as  general  as  possible,  and  should  reflect 
what  is  necessary  in  order  to  be  effective,  not  what  is  just  nice 
to  have.  For  example,  in  the  FFG  7  ASW  task  of  defending  the 
escorted  force,  our  criterion  for  judging  effectiveness  should  be: 
losses  to  the  escorted  force.  Note  that  this  criterion  only 
states  what  is  important  for  the  task.  A  second-order  criterion 
in  this  case  would  be  destruction  of  the  attacking  force  —  such 
destruction  would  be  nice,  but  if  we  can  prevent  losses  to  the 
protected force in any way possible, we will have successfully
carried out our task. We need to define general criteria for
each  assigned  task.  Note  that  the  criteria  at  this  stage  do 
not  include  numerical  goals  or  specifications. 

f.  System  Functions.  Description  of  the  system  is  formally 
brought into our procedure. The description is not a technical
description  —  our  interest  is  not  in  the  different  modes,  per 
se,  but  in  the  different  functions  in  an  operational  sense. 

g.  Preliminary  Tactics.  We  have  to  know  how  to  use  the  system 
under test. Early in OT&E, of course, the tactics can only be preliminary.
But the point is that the tactics used during the
evaluation directly influence the results. Considerable thought
should be given to the tactics to be used -- it may be wise to
task some Dev Group to propose the tactics. We should be
prepared to defend the tactics used, particularly if the
evaluation involves two competing designs.

h.  MOMS.  MOMS  is  the  probability  that  the  battle  element 
using  the  system  under  evaluation  will  perform  its  operational 
mission. As stated in paragraph 102.f, MOMS is a combined effectiveness
and suitability measure; it is a derived or calculated
measure, based on a series of conditional probabilities or
functional MOEs, many of which may not be directly measurable
during OT&E. Inputs may be necessary from other sources. An
illustration of a MOMS is the ASW task for the FFG. The criterion,
protection of the escorted force, when translated into measurement
terms is: losses in the escorted force; this in turn leads to
number of escorted ships lost per attack.

202.  Preparation for Planning. This paragraph gives broad
guidance in planning, to indicate the direction of the planning
effort. Detailed steps in planning are given in paragraph 215.

a.  The  general  project  purpose  is  divided  into  warfare  areas 
and  then  into  effectiveness  (performance  or  capability)  and 
suitability  (reliability,  availability,  human  factors)  objectives. 
Each of these is best treated separately, at least initially.
It is understood that a CEM for performance will be combined
with an MOS for availability and a MOMS formed for the final
report.

b.  A  CEM  is  created  for  each  warfare  area  based  on  the 
various  missions  and  corresponding  fleet  criteria.  Consider  the 
performance  for  one  warfare  area,  say  ASW. 

(1)  Escort. CEM may be "Probability of a convoy ship surviving
a submarine attack."

(2)  Patrol.  CEM  may  be  "Probability  of  submarine  attack 
leading  to  submarine  being  sunk." 

c.  For each CEM, the analysis effort proceeds in two structures:

(1)  One  is  to  bring  the  different  scenarios  and  threats 
into account. This structure aids in test design and in determining
test situations during project operations. After project
operations,  we  may  find  that  different  scenarios/threats  may 
have  different  values  of  the  same  type  of  CEM. 

(2)  The major structure is in determining the type of
calculations to be used in deriving the CEM. In a realistic
free-play type of operation, we may be able to determine CEM simply;
e.g., by a ratio of number of successes to number of attacks.


More typically we would not have this luxury and CEM must be
derived by a formula such as:

$$\mathrm{CEM} = P_1 \cdot P_2 \cdot P_3 \cdot P_4 \cdots P_n$$

where $P_1, \ldots, P_n$ are the conditional probabilities of success in
steps leading to the end-objective or subobjective.

d.  The  type  of  calculations  to  be  used  in  determining  the  CEM, 
once  delineated,  leads  to  subobjectives.  For  example,  if  the 
above series of conditional probabilities is to be used, the subobjectives
would be determination of detection capability, acquisition
capability, etc.

e.  For  each  subobjective,  the  dendritic  breakdown  continues 
with  the  determination  of  the  type  of  data  needed  to  arrive  at 
each  probability. 

f.  Knowing the data need, the next step is to determine the
type  of  measurements  and  corresponding  instrumentation. 
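
The product formula in paragraph 202.c(2) can be illustrated with a minimal Python sketch. The step names and probability values below are hypothetical and chosen only to show the arithmetic; they are not taken from any test.

```python
from math import prod


def cem_from_chain(step_probs):
    """Multiply conditional step probabilities P1..Pn into a single CEM."""
    for name, p in step_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability, got {p}")
    return prod(step_probs.values())


# Hypothetical conditional probabilities, each given success in the prior step.
steps = {"detect": 0.8, "classify": 0.9, "launch": 0.95, "hit": 0.5}
cem = cem_from_chain(steps)
print(f"CEM = {cem:.3f}")
```

Each entry in `steps` corresponds to one subobjective, so the same structure also serves as a checklist of the data that must be obtained.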

203.  The Engagement Model

a.  The above steps demand a high level of analytical effort.
Early effort in an engagement model can help -- not only in forming
subobjectives, but also in other stages of our planning.

b.  The engagement model is descriptive of the engagement with
the  system  being  evaluated  in  its  battle  element  based  on  scenarios, 
threat,  and  preliminary  tactics. 

c.  Figure  2-1  gives  part  of  an  engagement  model  for  the  XXX 
System  under  evaluation  in  its  ASW  mission.  Basically,  the 
modeling  is  a  follow-through  of  specific  events  and  corresponding 
outcomes.




Figure  2-1 

Part  of  an  Engagement  Model 


d.  The  next  step  in  the  engagement  modeling  is  to  obtain 
guesstimates for the likelihood of the various events/outcomes.
Some may be based on published reports, some may be group consensus,
some may be based on projections of contractual specifications.
Some (such as enemy aggressiveness) may be merely guesses that
may be handled by a range of values.

e.  After the guesstimates are formed, the arithmetic is followed
through -- e.g., starting with 100 submarine attacks and
ending  with  some  value  of  the  CEM. 

f.  Sensitivity  analysis  can  also  be  done,  particularly  on 
"sure" guesses and on guesstimates that might be expected to be
obtained  during  project  operations.  Some  critical  operational 
questions  can  be  pinpointed  quantitatively  at  this  stage. 
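
The follow-through arithmetic and the sensitivity analysis described in paragraph 203 can be sketched in a few lines of Python. The stage probabilities are hypothetical guesstimates, and the choice of the first stage as the uncertain one is an assumption made only for illustration.

```python
def follow_through(n_attacks, stage_probs):
    """Carry an initial number of attacks through successive stages.

    Returns the expected number of attacks remaining after each stage.
    """
    counts = []
    remaining = float(n_attacks)
    for p in stage_probs:
        remaining *= p
        counts.append(remaining)
    return counts


# Hypothetical guesstimates for four successive events.
base = [0.7, 0.9, 0.8, 0.5]
counts = follow_through(100, base)
cem = counts[-1] / 100

# Sensitivity: handle one uncertain guess (here the first stage) as a range.
sensitivity = {}
for p_first in (0.5, 0.7, 0.9):
    trial = [p_first] + base[1:]
    sensitivity[p_first] = follow_through(100, trial)[-1] / 100

print(counts)
print(cem)
print(sensitivity)
```

A wide spread in `sensitivity` flags an input that deserves attention during project operations; a narrow spread suggests that a rough guess is adequate.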

204.  Uses of the Engagement Model

a.  Analysis.  The  engagement  modeling  effort  can  help  in  the 
analysis  steps. 

(1)  By going through the calculations to determine CEM,
we are in a firmer position to specify the type of formula
to be used in paragraph 202.c. We can begin to determine
which components need project operations, which are already
available, etc.

(2)  In paragraph 202.d, the modeling effort leads to an
obvious subobjective: detection capability, for example.
Also, the model tells us that some finer subobjectives are in
order. For example:

(a)  Is the detection capability better when System XXX
is alerted by picket forces?

(b)  Is the detection capability different when the submarine
makes an optimum approach as opposed to after evasion?

A  finer  cut  can  be  made  of  parts  of  the  engagement  model  that  in 
turn  can  lead  to  finer  subobjectives.  For  example: 


Can  be  blown  up  to  include 


This in turn leads to development of timely attack criteria subobjectives,
decoy effectiveness subobjectives, etc. This stepping
procedure is continuously recycled and updated as inputs
become firmer and firmer. At first, the procedure is done broad-brush.
Even in gross terms the procedure is helpful.


b.  Long  Lead  Items.  The  evaluation  procedure  may  indicate  the 
need  for  new  types  of  targets,  for  particular  threat  intelligence, 
or for a hybrid-type simulation. The sooner these long lead items
are  addressed,  the  more  likely  they  will  be  available. 

c.  Allocation  of  Effort.  The  CEM  calculation  will  include 
components  that  are  beyond  our  test  capability  —  e.g.,  probability 
of  warhead  damage.  The  evaluation  logic  will  identify  components 
to be handled by published reports, by technical testing, by simulation
of some type, etc.

205.  Analysis  in  the  TEMP 

a.  With  respect  to  analysis,  the  TEMP  (Test  and  Evaluation 
Master  Plan)  gives  the  logic  of  the  project  evaluation  viewed  as 
a  whole.  The  TEMP  outlines  the  information  needed,  the  sources 
to  be  used  for  the  information  including  elements  to  be  obtained 
from  at-sea  testing,  and  directly  supports  its  implementation  in 
the  Test  Plan.  Particular  attention  is  given  to  long-lead  items. 

b.  The  parts  of  the  TEMP  that  give  the  objectives,  missions, 
scenarios,  threats,  criteria,  employment  concepts,  etc.,  are  the 
basis for the analysis effort. The results of the dendritic analysis
effort are also given in the TEMP. The culmination at the subobjective
stage may be given as in Figure 2-2. This working figure
summarizes the CEM by giving the functional MOEs needed for the
CEM. In addition, it depicts the likely sources of the needed
information.  The  information  in  Figure  2-2  can  be  presented  in 
other  ways.  The  purpose  is  to  indicate  as  a  concise  summary  the 
type  of  decisions  made  in  the  TEMP. 




Scenario 1/Threat 4
Scenario 4/Threat 4

Mission A (ASW)

Pdet  Pclass  Plaunch  Pacq  Phome  Phit  Pkill

Scenario 1/Threat 4

B,D  B,D  U

Scenario 2/Threat 1

Mission B (ASMD)

Tdet  Tclass  Tlaunch  Tfire  Tkill

... etc ...

Source Code

A  OPEVAL testing
B  TECHEVAL testing
C  TECHEVAL/OPEVAL
D  Earlier DT Testing
E  Simulation
F  Fleet Reports
G  Naval Lab Report
H  3M System
I  Limitation to Scope
N  Not Pertinent
U  Undecided


Figure  2-2 

Objective  Source  Matrix 


c.  Each element in Figure 2-2 is, of course, based on the
rationale given elsewhere in the TEMP. For example, Pkill given
in Figure 2-2 is relegated by the code I to Limitations to
Scope. Why the CEM is restricted to Hit rather than to Kill should
be  explained.  For  example,  simulation  is  to  be  used  instead  of 
testing  at  sea  for  one  scenario.  The  risk  of  the  simulation  not 
being  validated  should  be  taken  into  account.  The  scope  and 
determination  of  source,  etc.,  for  the  elements  are  the  result  of 
tradeoffs  taking  schedules,  cost,  and  expected  resources  into 
account.  In  addition,  while  the  elements  in  Figure  2-2  are  not 
finely  structured,  some  fine  structure  may  actually  be  needed 
with  respect  to  long-lead  items:  specially  configured  targets, 
simulation,  etc. 

d.  The CEM approach will automatically cover many operational
issues. These will pertain mainly to quantifiable issues. However,
there are a host of issues that cannot be handled by the
CEM approach. Particular effort must be made to include these
in the TEMP.

206.  Analytical  Testing  Issues.  The  analysis  effort  for  the 
TEMP  gives  rise  to  and  is  strongly  influenced  by  analytical 
testing  issues. 

a.  Assumptions.  The  TEMP  is  based  on  many  assumptions, 
changes  to  which  may  necessitate  a  review  and  change  in  the  TEMP. 
These  assumptions  include: 

(1)  Development  schedule. 

(2)  Decision  schedule. 


(3)  At-sea  resources;  quantity,  quality,  type. 

(4)  Target  resources;  quantity,  quality,  type. 

(5)  Instrumentation  needs. 

(6)  Simulation  availability. 

(7)  Engagement  model  availability. 

(8)  Documentation  availability. 

(9)  Technical  issue  resolution. 

b.  Test  Outline.  The  various  operational  measures  should  be 
defined  specifically.  The  need  for  definition  includes  functional 
events also. (For example, the summary of detection success (MOE)
will be determined by the range at the 50% cumulative probability.)
The critical parameters relating to system performance
should be listed and the expected source of their determination
given. The major emphasis would be on the overall test sequence
-- that is, which MOE or data would be obtained from factory
testing, etc. This outline should note the system integration
tests as well as other critical tests such as the software integration
tests.

c.  Operational  Testing.  The  testing  under  OPTEVFOR's  direct 
control  is  amplified  in  more  detail.  Definitions,  MOEs,  scenarios, 
and threats become more specific. The stress sequence is outlined
-- e.g., single target, no CM to multi-target with CM. Differentiation
as to "real" targets vice simulators is made. Variational
analysis about the key scenarios/threats is included.
Preliminary  tactics  are  taken  into  account.  Data  requirements 
including  instrumentation  accuracy  needs  and  sample  sizes  are 
included. 
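
The specific definition of a measure called for in paragraph 206.b can be illustrated with the detection example given there. The sketch below assumes an inbound target, one detection range per run, and a non-detection recorded as `None`; the function name and the data are hypothetical.

```python
import math


def range_at_cumulative(detection_ranges, fraction=0.5):
    """Range by which `fraction` of all runs had detected an inbound target.

    `detection_ranges` holds one entry per run: a range, or None when the
    run produced no detection. Returns None if the fraction is never reached.
    """
    n_runs = len(detection_ranges)
    detected = sorted((r for r in detection_ranges if r is not None), reverse=True)
    needed = math.ceil(fraction * n_runs)
    if needed == 0 or needed > len(detected):
        return None
    return detected[needed - 1]


# Hypothetical detection ranges in yards; one run had no detection.
runs = [9000, 7000, 4000, None, 6000, 5000]
r50 = range_at_cumulative(runs)
print(r50)
```

Counting non-detections in the denominator is a design choice; leaving them out would overstate the range, so the definition adopted should be stated in the plan.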


207.  Analysis in Test Planning. Analysis in test planning, as an
implementation of the TEMP, details how the total resources are
to be used to fulfill the project objectives. Many areas involve
specific analysis techniques. The Analyst will actually apply the
methodology; however, many decisions will have to be made in
its application. The Analyst is expected to clear the important
ones with the OTD. For example, the Analyst may have an allocation
of runs at sea so that a particular series will have three
replications. What does this sample size mean in terms of
confidence? The OTD would be interested in errors such as:

a.  Likelihood of reporting operationally important differences
that do not actually exist. This type of "false alarm" is called
Type I, or Alpha (α) error.

b.  Likelihood of not reporting operationally important
differences that do actually exist. This type of "miss"
is called Type II, or Beta (β) error.
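
To see what a sample size of three means for these two errors, the following Python sketch evaluates a simple pass/fail decision rule with the binomial distribution. The decision rule, the success probabilities (0.5 with no real difference, 0.8 with a real one), and the sample sizes are all assumptions chosen for illustration.

```python
from math import comb


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def error_rates(n, k, p_null, p_alt):
    """Alpha and beta for the rule 'report a difference if successes >= k'."""
    alpha = binom_tail(n, k, p_null)      # difference reported, none exists
    beta = 1.0 - binom_tail(n, k, p_alt)  # real difference not reported
    return alpha, beta


alpha3, beta3 = error_rates(n=3, k=3, p_null=0.5, p_alt=0.8)
alpha10, beta10 = error_rates(n=10, k=8, p_null=0.5, p_alt=0.8)
print(f"n=3:  alpha={alpha3:.3f}  beta={beta3:.3f}")
print(f"n=10: alpha={alpha10:.3f}  beta={beta10:.3f}")
```

Under these assumptions, three replications leave both errors large, and ten runs reduce both. The point is the tradeoff between sample size and the two risks, not the particular numbers.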

208.  The  Function/Variable  Chart 

a.  While  analysis  techniques  will  be  used  by  the  Analyst, 
the  inputs  for  these  techniques  need  be  generated  jointly  by  the 
OTD  and  Analyst.  An  effective  tool  for  this  is  the  Function/ 
Variable  Chart.  The  Function/Variable  Chart  is  illustrated 

in  Figure  2-3.  This  example  concerns  a  handheld,  heatseeking, 
surface-to-air  missile.  The  columns  list  the  system  functions: 
the  functions  in  the  illustration  are  detection  by  radar  and/or 
visual  detection,  acquisition,  gyro  uncage,  and  fire. 

b.  The  rows  list  the  differences  among  the  scenario/threats, 
including CM and environment. These are called the variables.
The checks indicate which variables are considered important
for  which  function;  the  checks  indicate  the  test  variables 
by  function.  The  illustration  indicates  that  only  three  of 
the nine variables should be tested with respect to radar
detection. The Function/Variable Chart also includes other
features if available. For example the illustration indicates
the expected run size loss. These are guesstimates -- e.g., we
may plan for 100 runs, but only 90% may actually be run (and valid).
And likewise, because of non-detections, we may only have 60%
to determine acquisition. Another useful feature gives the
expected amount of data (independent) per run. For example, if two
or more weapons were to be used in each attack, the target acquisition
phase would double, etc., in sample size. In other words,
the  number  of  runs  may  or  may  not  be  the  amount  of  data.  This 
feature  is  not  indicated  in  the  illustration,  since  only  one 
weapon  was  to  be  available. 

c.  Another  useful  feature  is  the  type  of  data  to  be  used  in 
data  analysis  to  determine  success  or  non-success  of  each 
function.  For  example,  the  illustration  gives  three  types  of  data 
for  acquisition: 

(1)  Count. The number of raids having acquisition compared
to the number having visual detection.

(2)  Range.  The  range  difference  from  visual  detection 
to  acquisition. 

(3)  Error.  The  aiming  error  at  acquisition. 

d.  The  Function/Variable  Chart  gives  structure  and  direction 
to the discussions between the OTD and the Analyst. As such, it
is an effective summary and documentation. In addition, it gives
specific guidance as to data and instrumentation needs during
project operations, and to the post-test data analysis -- what
data to analyze for what test variables.

e.  More important, the chart focuses the discussion on the
functional approach that, when services are tight, may lead
to other means to satisfy sample size needs. These include combining
various tests and scenarios, reattack, dry runs, more than
one target or weapon, use of land-based test site data, etc.

f.  As  a  tool  for  the  Analyst,  it  may  aid  in  combining  various 
scenarios  into  one  matrix  or  factorial  approach.  This  is  an 
important  test  design  technique  for  efficient  selection  of  run 
condition. 
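
The run-size bookkeeping described in paragraph 208.b (expected run loss and data per run) can be sketched as follows. The function name is hypothetical, and the sketch assumes that each percentage is a fraction of the planned runs, which the text does not state explicitly.

```python
def expected_samples(planned_runs, retained_fraction, data_per_run=1):
    """Expected usable data points for each system function.

    `retained_fraction` maps a function name to the fraction of planned runs
    expected to yield data for it; `data_per_run` is the number of
    independent data points each surviving run contributes.
    """
    return {
        function: round(planned_runs * fraction * data_per_run)
        for function, fraction in retained_fraction.items()
    }


# Guesstimates from the text: 100 planned runs, 90% valid, 60% with detection.
retained = {"radar detection": 0.90, "acquisition": 0.60}
one_weapon = expected_samples(100, retained)
two_weapons = expected_samples(100, {"acquisition": 0.60}, data_per_run=2)
print(one_weapon)
print(two_weapons)
```

The same 100 runs give 60 acquisition data points with one weapon per attack and 120 with two, which is why the number of runs and the amount of data should be tracked separately.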

209.  Supplementing  Sample  Size.  Because  fleet  services  are 
tight,  obtaining  sufficient  confidence  for  our  report  demands 
a high-level analysis effort -- there are ways to increase confidence
other than with a large sample size. These ways include
obtaining supplementary data and using test design.

a.  Supplementary  Data.  Supplementary  and  complementary 
data  or  information  can  be  obtained  from  factory  testing, 
technical  testing,  land-based  testing,  simulation  testing,  etc. 

The  questions  of  pertinency  and  validity  are  critical  and  are 
the  basis,  in  part,  for  the  extensive  analysis  effort  described 
for  using  the  functional  approach,  etc. 

b.  Test  Design.  Test  design  is  analytical  jargon  for  the 
efficient arrangement or combination of test conditions. Two
simple  illustrations  will  be  given  in  the  paragraphs  to  follow. 


210.  Side-by-Side.  If  the  comparison  of  interest  is  between 
two  tactics  (or  two  modes,  etc.): 

a.  We  could  test  at  a  test  range,  one  week  with  Tactic  A 
and  another  week  with  Tactic  B.  By  extensive  instrumentation 
and  data  processing  and  analysis,  we  can  standardize  the  data 
for  environmental  changes,  etc.,  and  make  the  comparison  with 
the  standardized  data. 

b.  A  more  efficient  test  design  would  be  to  test  both  tactics 
side-by-side or back-to-back. When we make a run with Tactic A,
we immediately repeat the run with Tactic B. Each pair of test
runs can be made in different ocean areas, different environmental
conditions, etc. Differences in the data within each pair would
automatically cancel out the effect of the environment or ocean
area. The comparison between tactics could then be made directly
with  these  data  differences.  See  Figure  2-4. 

c.  Side-by-side  testing  improves  the  scope  of  the  comparison 
(different areas, etc.) with less testing and reduced instrumentation,
data processing, and analysis. The efficiency is such
that this approach is sometimes used in comparing two pieces of
equipment (old versus new) or in source selection problems
(manufacturer A versus B). Obviously, we must be able to have
both equipments available at the same time -- and the reliability
of each should be high. This simple concept can be extended
to multiple comparisons of various types.


Figure 2-4

"Shoot-Out"
Sanders vs Magnavox

Run   Date   Sea State   Layer Depth   Company    Detection   Data Difference
                                                              (Magnavox minus Sanders)
  1   7/3        2            50       Sanders    7500 yds
  2   7/3        2            50       Magnavox   8300 yds     800 yds
  3   7/12       6           200       Sanders    2500 yds
  4   7/12       6           200       Magnavox   3500 yds    1000 yds
  .    .         .             .          .           .
 21   8/1        *             *       Sanders    6600 yds
 22   8/1        *             *       Magnavox   7200 yds

* In Transit, data not available.
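
The side-by-side comparison can be sketched in Python using the detection ranges listed in Figure 2-4. The figure gives data differences only for the first two pairs; the third difference below is computed from the listed ranges for runs 21 and 22, and the summary statistics are an illustrative choice rather than the handbook's prescribed analysis.

```python
from statistics import mean, stdev

# Detection ranges (yds) for each back-to-back pair, as listed in Figure 2-4.
pairs = [
    {"runs": (1, 2), "sanders": 7500, "magnavox": 8300},
    {"runs": (3, 4), "sanders": 2500, "magnavox": 3500},
    {"runs": (21, 22), "sanders": 6600, "magnavox": 7200},
]

# Within-pair differences remove the conditions shared by both runs.
diffs = [p["magnavox"] - p["sanders"] for p in pairs]
mean_diff = mean(diffs)
sd_diff = stdev(diffs)

# For contrast: run-to-run spread of one company's raw ranges.
sd_raw = stdev([p["sanders"] for p in pairs])

print(diffs, mean_diff, sd_diff, round(sd_raw))
```

The raw ranges vary by thousands of yards from one day to the next, while the within-pair differences vary by only a few hundred. That is why the comparison can be made directly on the differences.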

211.  Factorial.  If  the  primary  interest  is  testing  various 
scenarios/threats  that  cover,  for  example,  various  combinations  of 
speed,  altitude,  CM,  and  manning  conditions: 

a.  We can, for example, select seven "most critical" combinations
and test each of them five times. Each threat result
would  be  based  on  sample  size  five.  Analytically  this  test  design 
is  simple  and  straightforward.  It  is  called  one-at-a-time. 

b.  We could, however, run all possible combinations of
the various settings as in a factorial. The matrix (illustration)
is shown in Figure 2-5. We now have 16 scenarios/threats
of  which  the  "seven  critical"  are  included.  We  can  test  each  of 
the  16  twice.  The  total  number  of  runs  is  similar  to  the  number 
made  one-at-a-time. 

(1)  A formal analysis technique is available that would
permit us to combine the data in various ways. For example, the
32 data runs can be divided into four sets of sample size eight
each by speed/altitude settings. See Table 2-1A. Another way
would be by CM and manning conditions. See Table 2-1B. Notice
that in each table the other conditions combined into each average
are balanced -- each average in a set has the same conditions
pooled.

(2)  Table 2-1C is a work table to simplify the calculations
-- it is merely the means of Table 2-1B, but in terms of
differences from the grand mean of all the data.




(3)  The result for each threat is found by correspondingly
summing Table 2-1A and 2-1C according to the scenario/threat conditions.
For example, if scenario/threat A is fast, low, clear,
I, we would use 88.2 in Table 2-1A with -3.1 in Table 2-1C --
the answer is 85.1 seconds. All other 15 scenario/threat results
can be found in a similar manner.

(4)  Notice that even though each scenario/threat was
tested twice, these two runs are not used, per se, to obtain each
scenario/threat result. Each result is obtained by using means
in Table 2-1, each based on a sample size of eight. The confidence
in each result compares favorably with those obtained using
the one-at-a-time approach, which are based on five runs each.
Using about the same number of total tests, the factorial increased
the results from seven scenario/threats to 16.

c.  There  is  extensive  literature  on  the  factorial  to  cover 
various situations. Basically, the larger we can make the factorial
matrix, the more efficient -- for large factorials we
need not make any repeat runs. In many situations, only half
or  a  quarter  of  the  factorial  matrix  need  be  tested  (with  no 
repeat  runs).  See  your  Analyst  for  details. 

d.  The factorial is efficient because each data point
is used over and over again in forming the various tables of means.
However, each missing data point affects each table. If many
are missing, the efficiency is lost. Therefore, the large factorial
has a limited use in our work. If many data points are
expected to be missing, it is better to use a different test design.


Table 2-1

Illustration of Factorial Analysis
(Reaction Time in Seconds)


Table 2-1A.  Table of Means (Eight Data Points Each)

                   ALTITUDE
Speed          Low        High
Slow           66.0       61.1
Fast           88.2       57.5


Table 2-1B.  Table of Means (Eight Data Points Each)

               MANNING CONDITION
CM             I          III
Clear          65.1       61.6
SOJ            58.1       88.0


Table 2-1C.  Means: Differences from Overall Mean (68.2)

               MANNING CONDITION
CM             I          III
Clear          -3.1       -6.6
SOJ           -10.1      +19.8

212.  Paper Rehearsal. Experienced OTDs say: The best time to
plan the conduct of project operations is after operations are
completed. That is, difficulties always arise, wasted effort is
hard  to  avoid,  etc.  Experienced  data  analysts  say  the  same  with 
respect  to  analytical  inputs.  While  we  do  not  have  the  luxury 
of  hindsight,  we  try  to  approach  it  with  a  paper  rehearsal. 

After the test scenarios, data requirements, and data analysis
techniques are decided, the most critical elements are rehearsed
before firming up the Test Plan. This rehearsal can be a simple
examination at the blackboard or a complex series of trials on a
computer simulation duplicating the planned scenarios. The typical
rehearsal includes "creating" data, using Murphy's Law for
lost data, etc. See COMOPTEVFORINST 3960.7 for a description of
a  paper  rehearsal  and  an  illustration. 
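
A very small paper rehearsal can be run in Python: lay out the planned runs, lose some of them at random, and see which test conditions are left short of data. This is only a sketch under assumed values (four conditions, two replications, a 25% loss rate) and is not the procedure of COMOPTEVFORINST 3960.7.

```python
import random
from itertools import product


def rehearse(conditions, reps, loss_rate, seed=0):
    """Lay out planned runs, drop some at random, count survivors per condition."""
    rng = random.Random(seed)
    surviving = {c: 0 for c in conditions}
    for c in conditions:
        for _ in range(reps):
            if rng.random() >= loss_rate:  # this run was completed and is valid
                surviving[c] += 1
    return surviving


conditions = list(product(("slow", "fast"), ("low", "high")))
surviving = rehearse(conditions, reps=2, loss_rate=0.25)
no_data = [c for c, n in surviving.items() if n == 0]
print(surviving)
print("conditions left with no data:", no_data)
```

Repeating the rehearsal with different seeds shows how often a condition ends up with no data at all, which can be weighed before the Test Plan is firmed up.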

213.  Pretesting.  The  concept  of  rehearsal  can  be  broadened  to 
include  other  types  of  pretesting.  For  example,  if  a  model  or 
simulation  is  available,  sensitivity  studies  may  give  insight 
into  the  relative  differences  among  the  various  scenarios/threats. 
Based  on  these  studies,  project  operations  may  be  refined  or 
broadened. An important type of pretest concerns run geometries.
For complex geometries, particularly those including countermeasures,
a simulated rehearsal may indicate that certain geometries
would lead to "no opportunity," and a regroup may be in order.
Even if no changes in geometries are made, the run-by-run outputs
of the simulation would be a guide to the OTD during the actual
project operations.




214.  Bias. An important element in confidence is the avoidance
of  bias  creeping  into  project  operations,  data  analysis,  or  in 
the  presentation  of  the  report.  This  is  much  broader  than  the 
OTD  maintaining  strict  objectivity  at  all  times.  The  operational 
objective  must  always  be  kept  in  mind. 

a.  Test  Plan.  The  selection  of  test  scenarios/threats  should 
be  based  on  an  expectation  of  likely  use.  This  expectation  can  be 
based on documentation. In all cases, the OTD must use his judgment
based on his operational experience. For example, in "shoot-out"
type testing to select system A or system B, the scenarios/
threats to be tested should be selected so as not to favor either
system. While both systems would be tested with identical scenarios/threats,
the individual tactics, modes, etc., need not be
similar.  The  tactics  to  be  used  with  system  A  should  be  optimum 
for  it;  the  tactics  for  system  B  should  be  optimum  for  it. 

b.  Test  Sequence.  The  Test  Plan  should  also  consider  the 
sequence  of  testing;  extraneous,  time-varying  effects  may  distort 
our  results  unless  special  precautions  are  taken.  These  effects 
include environmental changes and the crew learning curve. Special
attention should, therefore, be paid to the sequence of
testing.  Some  situations  may  best  be  handled  by  test  design.  In 
many  situations,  some  form  of  randomness  in  sequence  is  used. 
Randomness  is  sequencing  test  conditions  by  a  formal  procedure 
using  chance.  Shuffling  cards  labeled  by  condition  is  the  usual 
method.  Randomness  is  a  form  of  insurance  that  systematic 
extraneous  trends  do  not  distort  our  results.  As  a  form  of 


insurance,  there  is  a  premium  we  have  to  pay.  For  example,  if 
we  randomize  target  approach  altitudes,  the  time  and  fuel  needed 
to  vary  altitude  might  materially  extend  test  time  or  reduce  the 
number  of  runs.  So  a  tradeoff  is  involved.  Usually,  complete 
randomness  is  not  followed;  some  stratified  randomness  is  done 
taking  the  premium  and  the  risk  into  account.  For  example,  if 
the  crew’s  learning  curve  is  in  the  sharply  changing  region,  we 
may  decide  to  randomize  the  sequence  of  testing  conditions.  If 
in  this  case  the  premium  is  too  high  and  test  designing  cannot 
help,  we  may  be  forced  not  to  randomize  and  later  in  reporting 
comment  on  the  possible  distortion. 
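
Stratified randomization can be sketched in a few lines. The sketch
below is illustrative only: the altitude blocks, the condition labels,
and the seed are hypothetical. The block order is fixed, to limit the
time and fuel spent changing altitude, while the order of conditions
inside each block is drawn by chance, the software equivalent of
shuffling labeled cards.

```python
import random

def stratified_sequence(blocks, seed=None):
    """Return a run sequence: fixed block order, random order within each block.

    blocks: dict mapping a block label (e.g. an altitude) to the list of
            test-condition labels to be run in that block.
    seed:   optional integer so the drawn sequence can be recorded in the
            Test Plan and reproduced later.
    """
    rng = random.Random(seed)
    sequence = []
    for block_label, conditions in blocks.items():
        shuffled = list(conditions)   # copy so the caller's list is untouched
        rng.shuffle(shuffled)         # chance decides the order inside the block
        sequence.extend((block_label, c) for c in shuffled)
    return sequence

# Hypothetical plan: two altitude blocks, three target aspects in each.
plan = {
    "low altitude": ["head-on", "crossing", "tail"],
    "high altitude": ["head-on", "crossing", "tail"],
}
runs = stratified_sequence(plan, seed=1981)
for number, (altitude, aspect) in enumerate(runs, start=1):
    print(number, altitude, aspect)
```

Recording the seed with the plan lets a reviewer confirm that the
sequence was drawn by a formal procedure and not chosen by hand.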

c. Qualitative Data. Qualitative data are usually suspect
with respect to bias because of the many ways bias can creep
in with questionnaires, ranking methods, etc. However, at times,
important points can only be backed up with qualitative data.
In addition, discussion of results and operational implications
may be based in large part on opinion. The point is that automatic
discounting need not be done. We should report the basis for
credence in such opinions. This basis includes the qualifications
of the source or sources. We would not want the opinion of the OTD
discounted if he observed the project operations, others present
shared his opinion, and he successfully defended his opinion in
the review of the report.

d. Reporting. There is a thin line between good reporting
and biased reporting. Some results may be given stronger emphasis
than others. Some findings may be considered minor and not even
mentioned. The numerous decisions made in presentation should be
based on operational considerations. Even so, the needs of various
groups using the report may differ. One facet of reporting,
credibility, deserves emphasis. Not only should our report be
accurate, but it should also be credible. For this reason,
non-standard analysis techniques are described and an annex
usually includes a run-by-run data summary.

215. Steps in Test Planning. Test planning is too complex to be
done "by the numbers." However, steps are given to indicate
the approach. Early in the planning stage, perhaps for TEMP
preparation, the methodology given in paragraphs 216-219 is begun.
This methodology has inputs: information on missions, scenarios,
threats, and tactics. Early in the project, this information may
be tentative, becoming more firm as time goes on. These inputs
are the primary responsibility of CNO and the OTD. The analyst
as a team member may comment, interpret, etc., but these inputs
are not analytical per se. The responsibilities of the analyst in
the steps given below are related to his areas of expertise. See
Section 6. The methodology is two-fold:

a. The MOMS, CEM, and MOS approaches outlined below are used
for quantitative elements. Paragraphs 216 and 217 list the
steps for this phase.

b. The other approach, which is used in parallel, is the "critical
issue" approach using the EEA technique for qualitative elements.
See paragraph 218. Paragraph 219 repeats many steps given
earlier, but from a "conduct of test" point of view to indicate
concurrency of these various paragraphs.




216. Methodology Steps for Quantitative Elements
STEP

1.  Treat each mission/scenario separately.

2.  Consider the test system as part of the larger system
(determine inputs/outputs) it is supporting.

3.  Verbalize the CEM.

4.  Use previous reports to refine the CEM. (These reports are
also useful in many of the following steps.)

5.  Refine the conditional part of the CEM.

6.  Include a survival clause in the CEM.

7.  Precisely define the CEM.

8.  Define each event in the CEM.

9.  Determine the functional MOEs in the CEM.

10.  Determine the procedure or formula for using the
functional MOEs to form the CEM.

11.  Form the general test objectives using the CEMs.

12.  Form the subobjectives by using the functional MOEs.

13.  Plan conduct of test around scenarios (see paragraph
219).

14.  Form the function/variable chart.

15.  Form the test design.

16.  Determine data needed for the functional MOEs.

17.  Determine data to be observed during test.

18.  Determine other data sources, particularly simulation.

19.  Check amount of services, number of runs, and test
sequence.

20.  Outline data analysis.

21.  Include a statistical confidence procedure.

22.  Include qualitative modifications from paragraph 218.

23.  Give important limitations.

24.  Run rehearsal and sensitivity studies on simulation
or with paper/pencil.


After project operations, the methodology steps are
done in reverse order for analysis.

Individual functional MOEs per mission are derived
(based on Step 16).

The particular functional MOEs are combined for all
scenarios if statistically valid.

The CEMs are found (using Step 10) per mission.

The various values of CEM are listed by environment,
threats, etc., per mission.

The CEM values are modified by the qualitative elements
in Steps 22 and 23 and the EEA approach.

The quantitative values of the CEM and qualitative
modifications are reported as operational effectiveness.

Repeat above for MOS (paragraph 217).

Combined estimate of effectiveness and suitability is


217. Determining Suitability Objectives


1.  Select a mission and scenarios.

2.  Verbalize MOS.

3.  Precisely define MOS.

4.  Define each event in MOS.

5.  Define MOS in terms of functional parameters.

6.  Relate functional parameters to specific hardware.

7.  Determine mission time per function per hardware.

8.  Determine method or formula for combining functional
parameters.

9.  Form general test objectives using MOSs.

10.  Form subobjectives using functional parameters.

11.  Review current version of project operations plan.

12.  Determine functional parameter data to be available from
project operations.

13.  Determine functional parameter data to be supplied from
other sources.

14.  Determine if project operations need revision.

15.  Determine sample size needs.

16.  Relate to expected services.

17.  Outline data analysis procedures.

18.  Repeat, if necessary, with other missions.

19.  Relate above to reliability model, if available.

20.  Plan to use such a model extensively if one is available.


218.  Methodology  Steps  for  Qualitative  Elements 
STEPS 

1.  Select  a  mission. 

2.  Review  source  documents  for  critical  questions. 

3.  Group  and  list  critical  questions. 

4.  Supplement  with  operational  questions. 

5.  Each  question  is  handled  separately. 

6.  Repeat  above  for  other  important  missions. 

7.  The  resultant  matrix  due  to  (5)  and  (6)  is  listed. 

8.  Redundancies  are  eliminated. 

9.  Collate  list  with  topics  from  functional  MOEs. 

10.  Note  specific  subobjectives  in  common  (paragraph  217). 

11.  Eliminate specifics better handled quantitatively.

12.  Note  specifics  better  handled  qualitatively. 

13.  Note  specifics  only  to  be  handled  qualitatively. 

14.  Each resultant specific is considered a test objective.

15.  List sub-elements for each test and form subobjectives.

16.  Further subdivision may be necessary in a few cases.

17.  If a subobjective is specific enough, it is termed an
EEA.

18.  Checklist  questions  are  formed  to  answer  each  EEA. 
These  are  descriptive,  answered  by  yes/no,  a  word, 
or  number. 


After project operations the checklist questions are
answered individually, and then combined into answering
subobjectives.

Test  objectives  are  answered  combining  quantitative 
with  qualitative  measures. 


219.  Determining  Project  Operations 
STEP 

1.  Select a mission.

2.  Select  key  scenarios. 

3.  Expand  description,  action  events  in  scenarios. 


4.  Determine  key  elements  in  scenarios. 

5.  Use  scenarios  as  basis  for  project  operation. 

6.  Determine elements to be simulated.

7.  Verbalize  (initial  cut)  scenario/project  operations. 

8.  Correlate  with  CEM  and  functional  MOEs. 

9.  Form  MOE  function/variable  chart. 

10.  Determine  MOE  data  needed  for  functional  MOEs. 

11.  Determine  MOE  data  to  be  available  from  project  operations. 

12.  Determine  MOE  data  not  to  be  available  from  project  operations. 

13.  Determine  MOE  data  to  be  obtained  from  other  sources. 

14.  Form  test  design. 

15.  Determine  sample  size  needs. 

16.  Relate  to  expected  services. 

17.  Outline  data  analysis  procedures. 

18.  Run rehearsal.

19.  Modify  approach  if  necessary. 

20.  Repeat  above  with  another  mission. 

21.  Eliminate redundancy.




Section  3 


Analysis  During  Project  Operations 

301.  Introduction.  One  of  the  objectives  of  a  Test  Plan  is  to 
document  the  handling  of  the  numerous  decisions  that  must  be  made 
during  project  operations.  During  test  planning,  these  decisions 
can  be  considered  in  the  quiet  of  the  office,  and  the  OTD  is  thus 
freed to conduct project operations unhampered by numerous questions.
However, many problems cannot be foreseen, some decisions
can  only  be  made  on-scene  because  prior  information  was  lacking, 
and  some  decisions  have  to  be  modified  because  of  the  on-scene 
situation.  Regardless  of  the  amount  of  planning  and  pretesting, 
the  conduct  of  project  operations  is  seldom  straightforward. 

The way project operations are actually conducted obviously affects
the  data  analysis  and  the  final  report.  Analysis-type  thinking, 
then,  has  an  important  part  in  many  decisions  made  on-scene. 

In some evaluations, the OTD may have the analysts close by or
on-scene. In other projects, the OTD must apply his own analytical
background and experience.

302. On-Scene Preparations

a. On-scene, the OTD quickly realizes that his Test Plan needs
updating. Many changes can occur. The system under evaluation may
not be complete, or a component may be down because of an on-order
part. Run geometries may have to be modified because of a substitute
target, etc. The technical certification may be incomplete.
The above indicate major decisions, and obviously, the analytical
thinking used in planning will help.




b. Other decisions will be mundane and would have little
impact. Some, seemingly mundane, may have important repercussions.
For example, should COMEX also include erasing non-test
targets being routinely tracked by the test equipment? To be
operational, we should not erase; however, later reconstruction/data
processing efforts may be increased many-fold if we do not erase.

c. An important preparation tool is pretesting or dry
runs. Dry runs are rehearsal runs, made in transit or in situ,
usually as complete as possible except for missile fire or use of
other limited expendable items. Dry runs are used to debug the
control and logistics of the operations. Particular emphasis is
put on the data instrumentation process. This is a final check
that non-test areas (such as range control, data takers, etc.)
do not waste runs.

d.  Formal  briefings  to  the  test  units  involved  and  informal 
discussions  should  stress  the  objective  of  the  testing,  primarily 
that  it  is  an  evaluation.  The  aim  is  not  to  make  the  system  look 
good  or  poor;  system  results  do  not  reflect  on  the  unit  operating 
it.  If  this  is  stressed,  individual  biases  are  minimized. 

303.  Analysis  During  Project  Operations 

a. The OTD's aim, of course, is to follow and complete the
Test Plan. However, surprises may occur and decisions have to be
made. It is important to recognize when a decision has to be
made; the tools used in planning can help. For example, simulation
results in sensitivity testing may have indicated the
importance of environment on detection. Or the rehearsal runs made
on a land-based test site may overlay the runs at sea. A
particularly difficult decision must be faced following a series of
failures. Suppose a series of sonar trials begins with a number
of no-detection runs. Should we continue or regroup? The
answer depends on many elements, most of them obvious. For
example, statistical variation in data is expected. Even if
repeat runs are made, wide fluctuations are not surprising.
Statistical formulae can help. On the other hand, a technical
examination may indicate the need for corrective action, in which
case the problem is clearly not statistical but technical. Therefore,
a technical review of the operational runs should be made
continuously. If possible, this should be done on-scene.

b.  The  OTD  should  attempt  to  keep  the  structure  of  project 
operations  constant  during  the  trials.  Care  should  be  taken  in 
varying extraneous (non-tested) elements. For example, in ASMD
the  Evaluator  may  want  to  change  his  procedure  in  the  middle  of 
the  test.  If  the  change  is  made,  with  or  without  the  concurrence 
of  the  OTD,  the  runs  affected  should  be  noted.  This  note  may  be 
an  important  guide  in  later  analysis  of  reaction  times.  This 
illustrates  the  importance  of  the  OTD's  Notebook. 

c. The OTD should review each run continuously or daily.
Data sheets or complete quick-look printouts should be collected
and examined. Missing data or unclear remarks should be tracked
down. Differences in actual from planned scenarios should be noted
and explained if possible.

d. A running account of progress should be made. This
includes not only the runs made and still to be made, but also the
results of each run. This running count or check sheet is used
for sequential-type decisions as to sufficiency, priority, and
perhaps, reallocation. Specific analysis techniques can be
used if they are prepared for in advance. For example, a
sequential decision chart can be prepared in the planning stage
for later use at sea. Each run result at sea is plotted; when
the results fall in a "stop test" region of the chart, the OTD
has sufficient confidence to reallocate his remaining resources to
other objectives. The more typical situation is in the closing
days of project operations with insufficient resources left to
complete the planned runs. If priorities are not helpful, this is
a difficult decision area. Generally speaking, we would want to
test as many different scenarios or planned test conditions
as possible before using resources to make repeat runs.
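
One way to build such a sequential decision chart is Wald's
sequential probability ratio test for success/failure runs. This is
an assumed technique for illustration; the chart used on a given
project may be built differently. The values p0, p1, alpha, and beta
and the run outcomes below are hypothetical planning inputs.

```python
import math

def sprt_lines(p0, p1, alpha, beta):
    """Boundary lines of Wald's sequential test for a success probability.

    p0:    success probability regarded as unacceptable
    p1:    success probability regarded as acceptable (p1 > p0)
    alpha: risk of calling the system acceptable when the truth is p0
    beta:  risk of calling the system unacceptable when the truth is p1
    Returns (slope, lower_intercept, upper_intercept) for a chart of
    cumulative successes plotted against number of runs.
    """
    g1 = math.log(p1 / p0)
    g2 = math.log((1 - p0) / (1 - p1))
    slope = g2 / (g1 + g2)
    upper = math.log((1 - beta) / alpha) / (g1 + g2)
    lower = math.log(beta / (1 - alpha)) / (g1 + g2)
    return slope, lower, upper

def chart_region(successes, runs, p0, p1, alpha, beta):
    """Name the chart region in which the plotted point falls."""
    slope, lower, upper = sprt_lines(p0, p1, alpha, beta)
    if successes >= slope * runs + upper:
        return "stop test: acceptable"
    if successes <= slope * runs + lower:
        return "stop test: unacceptable"
    return "continue testing"

# Hypothetical run outcomes plotted one at a time (1 = success).
results = [1, 1, 0, 1, 1, 1, 1, 1]
successes = 0
for run_number, outcome in enumerate(results, start=1):
    successes += outcome
    region = chart_region(successes, run_number, 0.5, 0.8, 0.1, 0.1)
    print(run_number, successes, region)
    if region != "continue testing":
        break
```

The two boundary lines are computed once, in the planning stage; at
sea the OTD only needs to plot each result and note the region.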


Section  4 

Analysis  After  Project  Operations 

401.  Introduction.  The  basic  objective  of  analysis  after  project 
operations  is  to  quantify  the  MOS  and  CEM  for  scenarios/threats 
correspondingly  found  homogeneous,  and  to  indicate  the  degree  of 
confidence  in  the  reported  quantities.  As  stated  earlier,  the 
analysis  effort  before  project  operations  forges  the  trail  or 
roadway;  post-operations  analysis  uses  the  same  roadway  in  the 
opposite  direction.  So  the  general  approach  has  already  been 
covered,  but  will  be  repeated  here.  Before  analysis  or  synthesis 
begins,  some  important  steps  must  be  taken: 

a. Review the pre-operations analysis effort and Test Plan to
determine the impact of project operations. For example, a
particular scenario may have only been tested once instead of a planned
number of times. This scenario may be relegated to a demonstration
and excluded from data analysis.

b.  Review  current  fleet  needs  to  determine  possible  changes 
in  emphasis  in  analysis.  A  change  of  emphasis  may  be  dictated  by 
results  during  project  operations.  For  example,  if  the  system 
was  obviously  ineffective  during  project  operations,  we  may  want 
to  analyze  some  areas  (causes)  in  greater  depth  than  planned. 

c. Review reports of technical testing for insights into
possible relationships or for possible use of technical results
and/or data.

402.  Steps 

a. The measurements taken during project operations and the
subsequent data processing are the basis for analysis. Experience
indicates that this area is filled with unexpected problems. Some
types of data may not be recoverable. Decisions have to be made
at various stages. These decisions include dropping some
types of data because of a lack of timeliness or because of
questionable validity of certain runs. Since data processing is the
beginning of confidence or lack of confidence (which cannot always
be quantified) in our final results, it is an inherent part of
the analysis. The end product of data processing is a run-by-run
data summary that includes each run made or attempted during project
operations. This run summary gives the actual test scenario/threats
or test conditions, environmental conditions, the various
performance data, validity codes, and remarks. Not only does this run
summary provide the formal base of subsequent analysis, it will
usually be an annex in the final report.

b.  The  next  series  of  steps  are  basically  data  analysis 
or  statistical  analysis.  First,  the  run-by-run  data  summary 
is  scanned  to  update  the  analysis  techniques  outlined  in  the 
Test  Plan.  Some  columns  of  data  may  not  be  amenable  to  formal 
statistical  analysis;  some  may  be  too  sparsely  filled,  others  may 
be  practically  constant.  For  example,  if  one  column  contains 
only  three  data  points,  it  would  be  better  to  report  it  as  such 
and  not  mislead  the  reader  by  statistical  elegance.  Or  if  sea 
state  varied  only  minutely,  the  analysis  time  would  be  better  spent 
on  other  variables. 

c. The first formal step in statistical analysis is to
measure and adjust (standardize) for extraneous effects such as
changes in environment, effects of practice, etc. Formal statistical
techniques are used to determine if a relationship does exist, what
the relationship is, and the importance of the relationship. This
is the basis for standardizing (or normalizing) the data for further
analysis. This information is of value, per se, and should be
included in the final report. Note that when a technical relationship
is available (inverse square law, etc.), it should not be used
unless the statistical results agree; the assumptions involved with
the law may not be valid in the project operations.

d. The next step is to determine the degree of synthesis:
how should the data for different test scenarios be combined,
if at all? The approach is to start with the smallest element, a
single test scenario or test condition, and combine it with as
many others as is valid. The limitations to the combining of
different sets of data are determined by formal statistical
techniques. If these techniques indicate significant differences,
then the data would not be pooled. For example, suppose in project
operations the classification was correct in 75 of 100
runs against a diesel target and in 25 of 100 runs against a
nuclear target. Pooling would give 100 correct classifications
out of 200 runs (50%). This 50% would be misleading: too high
in one case (nuclear), and too low in the other (diesel). If
statistical techniques indicate significant differences among
results for different scenarios, separate values for MOEs, CEMs,
and MOSs are indicated in subsequent analysis steps. Thus, in
the above illustration, separate CEM values would be determined
for the two types of targets. (In addition, an overall CEM can
be determined by weighting the two separate CEMs by intelligence
estimates of the distribution of the two target types.)
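
The pooling decision in this illustration can be checked with a
short calculation. The sketch below assumes a large-sample test for
the difference of two proportions as the "formal statistical
technique"; other tests may be more appropriate for small samples.
The 60/40 target mix used for the weighted overall value is
hypothetical, standing in for an intelligence estimate.

```python
import math

def two_proportion_z(success_a, runs_a, success_b, runs_b):
    """Large-sample z statistic and two-sided p-value for equal proportions."""
    p_a = success_a / runs_a
    p_b = success_b / runs_b
    pooled = (success_a + success_b) / (runs_a + runs_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / runs_a + 1 / runs_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail area
    return z, p_value

def weighted_value(values, weights):
    """Weighted average of separate values, e.g. by expected target mix."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Diesel: 75 of 100 correct; nuclear: 25 of 100 correct.
z, p_value = two_proportion_z(75, 100, 25, 100)
print("z =", round(z, 2), " p-value =", p_value)

# Hypothetical target mix: 60% diesel, 40% nuclear.
overall = weighted_value([0.75, 0.25], [0.6, 0.4])
print("weighted overall value =", round(overall, 2))
```

With a z statistic this large the difference is clearly significant,
so the two data sets would be reported separately and combined only
through the weighting.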

e. The above synthesis pertains to combining various test
situations, but with each type of data analyzed separately:
detection range data are analyzed separately from classification
data and from hit/miss data, etc. The next synthesis pertains to
combining various types of results into functional MOEs. For
example, detection range results are combined with "no contact"
results to form the MOE, probability of detection by range.
Or the reaction time for radar detection is combined with
threat analysis time and with decoy launch time, etc., into a total
reaction time MOE.
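
As an illustration of the first example, detection ranges and "no
contact" runs can be combined as follows. This is a minimal sketch:
it assumes a closing target, so that "detected by range r" means the
detection occurred at range r or greater, and the ranges, run counts,
and range marks are hypothetical.

```python
def detection_probability_by_range(detection_ranges, no_contact_runs, range_marks):
    """Fraction of all valid runs with a detection at or beyond each range mark.

    detection_ranges: detection range for each run that gained contact
    no_contact_runs:  number of valid runs with no detection at all
    range_marks:      ranges at which the MOE is to be reported
    """
    total_runs = len(detection_ranges) + no_contact_runs
    return {
        mark: sum(1 for r in detection_ranges if r >= mark) / total_runs
        for mark in range_marks
    }

# Hypothetical data: six detections (nautical miles) and two no-contact runs.
ranges = [12.0, 9.5, 15.0, 7.0, 11.0, 4.5]
moe = detection_probability_by_range(ranges, no_contact_runs=2, range_marks=[5, 10, 15])
for mark, probability in moe.items():
    print("detected by", mark, "nmi:", probability)
```

Including the no-contact runs in the denominator is the point of the
synthesis; averaging only the successful detection ranges would
overstate performance.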

f. The next step is to combine MOEs into the CEM, using MOEs
determined from project operations and MOEs from other sources, if
appropriate.

g. The CEM is then combined with the MOS results of availability
analysis and the MOMS value is determined. The actual
combining is perhaps by formulae or computer engagement model.
If the MOMS values are reported, the derivation used is reported,
as well as some indication of confidence.

h. The next step is to determine the level of confidence in
our final results. Statistical confidence can be obtained for
some components used in the CEM and MOMS. However, there may be
other components with unknown confidence. Some may be based on
extrapolations, some may be based on intelligence estimates, etc.
And, of course, there are important aspects that are qualitative.
Sensitivity analysis may help in indicating the impact of possible
error in some functional MOE components such as intelligence
estimates. Our confidence in some component values may be so small
that the scope of the CEM is restricted and only a partial CEM is
reported. Our confidence in the CEM/MOS values may be so small
that the decision may be made not to report these values as absolute
but only relative, if at all. Confidence in the CEM values, as
with all confidence problems, involves a series of tradeoffs. On one
hand, we realize that the formation of MOMS is based on modeling,
and the more we model, the greater the danger of large errors. On
the other hand, if we eliminate this type of error and avoid the
MOMS, we end up with a series of functional MOEs. This forces each
reader/decision-maker to integrate (mentally) the details into a
personal judgment.




Section  5 


Suitability  Considerations 

501. Scenario Approach. The analysis thinking already presented
in general terms applies to suitability also. Scenarios, operational
realism, test planning, etc., are as important in suitability
as in effectiveness.

a.  For  example,  different  scenarios  may  result  in  different 
suitability  measures.  If  a  system  uses  different  hardware  in 
different  modes,  the  suitability  result  for  one  scenario  may  be 
different from another. Suppose different failure rates pertain
to search mode hardware, track/acquisition hardware, and launch/firing
hardware. The suitability measures would vary strongly by
scenario if mode times vary by scenario. A complex system with
different modes/hardware has many suitability measures.

b. The scenario approach to suitability requires an imaginative
operational interpretation of the criteria. For example,
an ESM processor may have an MTBF criterion of 470 hours. This
technical or design criterion, while it is interesting, has
little operational meaning. A more meaningful operational
specification is derived by considering the planned fleet use of the
processor. The processor is to be used with a LAMPS helo: 90-day
deployment, 40 sorties per month, a typical sortie being 1.5 hours.
No repairs are to be made at sea except direct replacement. Only
one spare is available. Evaluating operational suitability requires
taking the above into account in forming the MOS. An operational
measure is having the processor (or spare) functioning without
materiel abort during the entire deployment. The OTD recommends
setting the threshold probability at 0.95; in other words,
95% of wartime deployments should be completed with the processor
functioning at the end of the deployment. This operational
criterion obviously goes beyond the simple technical measure of
MTBF.
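
The relationship between the technical criterion and the operational
measure can be sketched numerically. The sketch rests on assumptions
that go beyond the example as stated: a constant failure rate,
failures accruing only during sortie hours, a spare that does not
fail in storage, and a 90-day deployment counted as three months of
40 sorties. The function names are illustrative.

```python
import math

def prob_deployment_complete(mtbf_hours, operating_hours, spares=1):
    """P(no more failures than spares) for a constant failure rate."""
    expected_failures = operating_hours / mtbf_hours
    return sum(
        math.exp(-expected_failures) * expected_failures ** k / math.factorial(k)
        for k in range(spares + 1)
    )

def mtbf_needed(threshold, operating_hours, spares=1):
    """Smallest MTBF (hours) that meets the threshold, found by bisection."""
    low, high = 1.0, 1.0e6
    for _ in range(200):
        mid = (low + high) / 2
        if prob_deployment_complete(mid, operating_hours, spares) >= threshold:
            high = mid
        else:
            low = mid
    return high

operating_hours = 3 * 40 * 1.5   # three months x 40 sorties x 1.5 hours = 180
p_no_spare = prob_deployment_complete(470, operating_hours, spares=0)
p_one_spare = prob_deployment_complete(470, operating_hours, spares=1)
required = mtbf_needed(0.95, operating_hours, spares=1)
print("no spare :", round(p_no_spare, 3))
print("one spare:", round(p_one_spare, 3))
print("MTBF needed for 0.95 with one spare:", round(required), "hours")
```

Under these assumptions a 470-hour MTBF gives a deployment
probability of about 0.94 with one spare, slightly short of the 0.95
threshold, which shows why the operational criterion cannot be read
directly from the design MTBF.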

c. With respect to suitability, certain simplifications in
analysis are possible. In many projects, different test conditions
need not be imposed for suitability determinations. For example,
a radar is tested, materiel-wise, whether an actual target is
present or not. The stress on the radar is practically the same.
In many projects, then, data on suitability are accumulated while
test conditions are being varied for effectiveness purposes.

d.  Seldom  can  the  scenario  approach  be  tested  directly  or 
completely.  However,  standard  analysis  techniques  can  be  used 
with  hardware  failure  rates  and  repair  rates  determined  from  at-sea 
testing  to  derive  the  results  expected  in  various  scenarios.  The 
at-sea  testing  is  necessary  to  determine  the  hardware  reliability 
and  then  a  reliability  model  is  made  to  take  the  different 
scenario-dependent  hardware  and  modes  into  account.  Thus,  an 
observed  hardware  failure  at  sea  may  be  critical  in  one  scenario 
but  minor  in  another. 
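
A minimal sketch of such a reliability model follows. It assumes a
series model in which each mode's hardware has a constant failure
rate and must survive its own time in use; the failure rates and
mode times shown are hypothetical, not measured values.

```python
import math

def scenario_reliability(failure_rates, mode_hours):
    """Series model: each mode's hardware must survive its time in use.

    failure_rates: failures per operating hour for each mode's hardware
    mode_hours:    hours spent in each mode during the scenario
    """
    exposure = sum(failure_rates[mode] * hours for mode, hours in mode_hours.items())
    return math.exp(-exposure)

# Hypothetical failure rates, as would be estimated from at-sea testing.
rates = {"search": 1 / 500, "track": 1 / 200, "launch": 1 / 50}

# Hypothetical scenarios that differ only in time spent per mode.
scenarios = {
    "long search": {"search": 10.0, "track": 0.5, "launch": 0.1},
    "short engagement": {"search": 1.0, "track": 1.0, "launch": 0.5},
}
results = {name: scenario_reliability(rates, hours) for name, hours in scenarios.items()}
for name, value in results.items():
    print(name, round(value, 4))
```

The same hardware failure rates give different suitability results
per scenario because the mode times differ, which is the reason the
at-sea data are fed through a model and not reported alone.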

502. Operational Suitability

a. Operational suitability encompasses a spectrum of elements,
broadly classified as reliability, maintainability, supportability,
etc. Each of these broad categories contains subelements, some of
which can be quantified, and some of which cannot. The significance
of reliability in hardware design, which can be quantified in
various ways, and the intangible characteristics of operator
proficiency, which currently defy quantification, represent the
opposite ends of this overall spectrum.

b. In order to determine the quantitative measures of operational
suitability, data are accumulated to determine MTBF (mean
time between failures), MTTR (mean time to repair), MTFL (mean
time to fault locate), MTBM (mean time between maintenance), Ao
(operational availability), MSI (maintenance support index), R
(mission reliability), MMT (mean maintenance time), etc. The
numerical values of these indices are not the only indication of
operational suitability, but they are needed to relate the at-sea
observations to the operational scenarios.

503. Data Collection and Processing

a. Failure analysis, an important step in data collection,
is determining not only what failed but also why. Failures are
categorized as follows:


(a) Design.

(b) Manufacturing (quality control).

(c) Personnel (maintenance, handling).

(d) Non-hardware (software).

(e) Outside envelope (stress, test conditions).

(f) Combination of above.

(g) Exercise (non-valid).

(h) Unassignable (unexplained, random).

The process of deciding the category of failure is difficult and
may lead to disagreement with the DA or contractor. OPTEVFOR has
a fleet-assessment point of view. For example, a failure may
or may not be deemed to have occurred in a realistic fleet
situation (handling damage is in this category). Some categories may
be ignored for some reliability indices, but not for others. In
other words, each failure must be examined critically and analysis
must be carefully done.

b. Quantitative data relative to DOWNTIME should be collected
on standard 3M, MAF, or NAMP forms. Some modification to the
instructions for completing these forms will be required to ensure
that all data are recorded. Quantitative data relative to UPTIME
should normally be collected in an Operator's Log, Debriefing Log,
etc. Extreme care must be taken in establishing the requirement
for this log to ensure that it does not impact on the performance
of the operator. Automatic data recording, if available, takes
precedence, and portable voice and video tape recorders should
be considered.

c. Qualitative data consist of comments entered on data
collection forms plus questionnaires and debriefs following project
operations. In some cases, these qualitative data may be
partially quantified to the extent of identifying the number,
background, etc., of respondents.

504. Reliability. "The probability that an item will perform its
intended function for a specified interval under stated conditions."
Operationally it is usually expressed as a probability of completing
an engagement without a failure.


a.  Because  systems  under  test  are  frequently  complex  and 
contain  many  components  or  electronic  circuits,  operating  time 
under  full  stress  is  the  important  measurement.  Excluding  wear-m 
and  wear-out,  the  failure  rate  is  usually  stable.  The  steady  state 
situation,  constant  failure  rate,  leads  to  the  use  of  the  exponen¬ 
tial  distribution.  This  is  used  as: 

    R = e^{-t/MTBF}

where  R  =  probability  of  completing  an  engagement  of  time  t  without 
a  failure. 
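
The exponential relationship R = e^(-t/MTBF) can be checked numerically. The following is a minimal sketch in Python; the function name and the sample values (a 4-hour engagement and a 40-hour MTBF) are illustrative assumptions, not figures from any project.

```python
import math


def mission_reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of completing an engagement of length t without a failure.

    Assumes a constant failure rate (exponential model): R = exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if t_hours < 0:
        raise ValueError("engagement time cannot be negative")
    return math.exp(-t_hours / mtbf_hours)


# Hypothetical values: a 4-hour engagement and a 40-hour MTBF.
r = mission_reliability(4.0, 40.0)
print(f"R = {r:.3f}")
```

Note that R falls quickly as the engagement length approaches the MTBF; at t = MTBF the model gives R of about 0.37.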

b.  MTBF  in  the  above  equation  is  the  operational  test  time 
divided  by  the  number  of  failures.  How  the  failures  are  counted 
influences the operational interpretation of R. For example, if
MTBF is based on counting only failures that cause mission abort,
then  R  is  the  probability  of  completing  an  engagement  without 
mission-aborting  failure.  The  observed  MTBF  is  usually  considered 
as  a  best  estimate  from  which  confidence  estimates  or  predictions 
can  be  made  using  statistical  techniques.  MTBF  is  usually  deter¬ 
mined  by  dividing  operating  time  by  the  number  of  critical  (mis¬ 
sion-aborting),  and  major  (mission-degrading)  failures.  Minor 
failures  are  usually  ignored  in  determining  operational  relia¬ 
bility  (and  for  determining  most  maintenance  measures). 
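
The effect of the counting rule on MTBF can be illustrated with a short sketch. The category labels ("critical", "major", "minor"), the function name, and the test record below are hypothetical; only the rule "operating time divided by the number of counted failures" comes from the text.

```python
def observed_mtbf(test_hours, failure_categories, counted=("critical", "major")):
    """Operating time divided by the number of counted failures.

    Returns None when no counted failures occurred, because the
    ratio is undefined in that case.
    """
    if test_hours <= 0:
        raise ValueError("test time must be positive")
    n_counted = sum(1 for category in failure_categories if category in counted)
    if n_counted == 0:
        return None
    return test_hours / n_counted


# Hypothetical test record: 200 operating hours and five logged failures.
failures = ["critical", "major", "minor", "minor", "major"]

# Usual rule: count critical and major failures, ignore minor ones.
mtbf_critical_and_major = observed_mtbf(200.0, failures)

# Alternative rule: count only mission-aborting (critical) failures.
mtbf_mission_abort_only = observed_mtbf(200.0, failures, counted=("critical",))

print(mtbf_critical_and_major, mtbf_mission_abort_only)
```

The same test record yields two different MTBF values, so any reported R should state which failures were counted.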

c.  In  many  projects,  reliability  pertains  to  a  combination 
of  software  and  hardware.  From  an  analysis  point-of-view  it  is 
best to separate software and hardware failures, and later combine
both into a combined, total picture. The primary reason for initial
separation of software and hardware is related to maintenance
and logistics. It may take seconds to "restore" a software failure,
while it may take hours to correct a hardware failure. While
separation is desirable, it may be difficult to pin down a failure
to software. To avoid misinterpretation we use the expressions
"hardware failures" and "non-hardware failures."

d. Failure is not always dependent on operating time. Reliability
of one-shot devices (where the outcome of a test of the
device can be classified as a success or failure) is measured as
the proportion of successes to total number of trials.
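
For one-shot devices the calculation is a simple proportion. A minimal sketch, with a hypothetical function name and hypothetical trial counts:

```python
def one_shot_reliability(successes: int, trials: int) -> float:
    """Proportion of successful trials for a one-shot device."""
    if trials <= 0:
        raise ValueError("at least one trial is required")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return successes / trials


# Hypothetical result: 18 successes in 20 trials.
print(one_shot_reliability(18, 20))
```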

e. Reliability evaluations may also contain a qualitative
assessment regarding any aspects of design, workmanship,
installation, or operations that affect reliability.

f. In projects where the system under test has more than
one mode of operation or mission, it is important to report the
reliability of each mode or mission rather than one overall figure.
Data must be carefully analyzed to ensure correct application to
separate modes or missions. Complex systems may require other
forms of analysis such as modeling or failure rate weighting to
account for a utilization factor that differs from the one
experienced in testing.

505. Maintainability

a. An assessment of the effort or work required to keep a
system in a state of readiness during a deployment, the
maintainability assessment can be expressed in many ways, with most of
the common measures using time:

(1)  MTTR  (mean  time  to  repair). 

(2)  MTFL  (mean  time  to  fault  locate). 

(3)  MTBPM  (mean  time  between  preventive  maintenance). 

(4)  MMHPFH  (mean  man-hours  per  flight  hour). 

(5)  MSI  (maintenance  support  index). 

Most  of  these  classic  measures  should  be  translated  to  operational 
terms.  For  example,  using  the  length  of  a  deployment,  demands 
on  the  system,  MTBF,  and  MTTR,  one  can  make  a  statement  such  as 
"During  a  typical  30-day  deployment,  the  system  will  be  under¬ 
going  maintenance  16%  of  the  time." 
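
One way such an operational statement might be derived is sketched below. The sketch assumes the system is demanded continuously, alternates between up periods averaging MTBF and corrective maintenance periods averaging MTTR, and ignores preventive maintenance. These simplifications, the function name, and the sample values are assumptions made for illustration.

```python
def corrective_maintenance_fraction(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time spent in corrective maintenance.

    Assumes continuous demand and alternating up/repair cycles, so each
    cycle lasts MTBF + MTTR on average and MTTR of it is spent in repair.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mttr_hours / (mtbf_hours + mttr_hours)


DEPLOYMENT_DAYS = 30  # length of the deployment used in the statement

# Hypothetical inputs: 42-hour MTBF and 8-hour MTTR.
fraction = corrective_maintenance_fraction(42.0, 8.0)
hours_in_maintenance = fraction * DEPLOYMENT_DAYS * 24

print(f"{fraction:.0%} of the deployment, about {hours_in_maintenance:.0f} hours")
```

If the demands on the system are intermittent rather than continuous, the operating hours accumulated per day would have to be factored in before a statement of this kind is made.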

b. Failures trigger the time measurements discussed above.
Operationally, failures trigger another action--the effort to
overcome the impact of the failure and continue the engagement.
The time it takes to shift to a backup system, or secondary mode
of operation, in order to complete the mission is not necessarily
related to the time it takes for actual repairs, and it should
be noted. For example, a blown circuit in a computer power
supply may take a few seconds to restore, but it may take hours to
regenerate the automatic display of tactical information.

c.  Maintainability  evaluations  also  contain  a  qualitative 
assessment  regarding  aspects  of  design,  installation,  or  opera¬ 
tion  that  affect  maintenance.  In  some  cases,  systems  perform  with¬ 
out  a  failure  or  with  few  failures  during  an  evaluation  --  yielding 
little  or  no  data  on  maintainability.  To  ensure  that  data  are 
obtained,  always  make  provisions  for  a  backup  maintainability 
demonstration (using prefaulted modules, for example) so that the
maintainability  assessment  need  not  be  entirely  qualitative. 


d.  The  purpose  of  calculating  MTTR  is  to  measure  the  cor¬ 
rective  maintenance  time  involved  in  restoring  to  a  fully  opera¬ 
tional  status  a  system  that  has  failed  or  is  significantly 
degraded.  Corrective  maintenance  time  is  the  time  during  which  one 
or  more  personnel  are  working  on  a  critical  or  major  failure  to 
effect  a  repair.  Corrective  maintenance  time  includes:  prepar¬ 
ation,  fault  location,  part  procurement  from  local  (on-board) 
sources, fault correction, adjustment/calibration, and follow-up
checkout times. It excludes supply and administrative time. In
calculating MTFL, the purpose is to arrive at a measure of the
difficulty  of  troubleshooting  the  equipment. 

e. Experience has indicated that repair times and fault locate
times are not normally distributed. A few repair actions take
an extremely long time compared to the bulk of the actions. An
arithmetic mean of such data is not representative. Statistically
this is handled by calculating the geometric mean. The geometric
mean is more useful in summarizing, predicting, and estimating
such times as repair and fault locate. The geometric mean is
obtained by averaging the logarithms of the time data and then
finding the antilog.
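
The geometric mean calculation described above can be sketched as follows. The repair times are hypothetical and were chosen so that one long repair action dominates the arithmetic mean.

```python
import math


def geometric_mean(times):
    """Average the logarithms of the times, then take the antilog."""
    if not times:
        raise ValueError("at least one time is required")
    if any(t <= 0 for t in times):
        raise ValueError("times must be positive to take logarithms")
    mean_log = sum(math.log(t) for t in times) / len(times)
    return math.exp(mean_log)


# Hypothetical repair times in hours; one action took far longer than the rest.
repair_hours = [0.5, 1.0, 1.0, 2.0, 16.0]

arithmetic = sum(repair_hours) / len(repair_hours)
geometric = geometric_mean(repair_hours)

print(f"arithmetic mean = {arithmetic:.2f} h, geometric mean = {geometric:.2f} h")
```

With these values the arithmetic mean is pulled well above every repair time except the longest, while the geometric mean stays close to the bulk of the actions.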

f. Another significant index of maintainability is MSI,
a measure of the number of man-hours of active maintenance time
(preventive and corrective) required for each hour of equipment
operating time.
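
A minimal sketch of the MSI calculation follows. The function name and the sample man-hour and operating-hour figures are hypothetical.

```python
def maintenance_support_index(preventive_man_hours: float,
                              corrective_man_hours: float,
                              operating_hours: float) -> float:
    """Man-hours of active maintenance per hour of equipment operation."""
    if operating_hours <= 0:
        raise ValueError("operating time must be positive")
    return (preventive_man_hours + corrective_man_hours) / operating_hours


# Hypothetical totals: 30 preventive and 50 corrective man-hours
# accumulated over 400 equipment operating hours.
msi = maintenance_support_index(30.0, 50.0, 400.0)
print(f"MSI = {msi:.2f} man-hours per operating hour")
```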

g. For systems with built-in test equipment, the maintainability
assessment is not complete without an evaluation of the
adequacy of this maintenance tool. Some useful numerical indices
are false alarm rate, degree of coverage, and percentage of correct
isolations. If data are not available, a qualitative assessment
of the test equipment may be made through questionnaires and
interviews with maintenance personnel concerning the usefulness
and adequacy of the equipment and the confidence they have in
its output.

506. Availability

a. A_o (operational availability) best expresses the probability
that the weapon system will be ready when needed to
engage the enemy. During a deployment, the interval of time
during which the system may be called upon to carry out an
engagement is called engagement demand time. A system is available
during this period if it is operating, in standby, or off and can
be brought on-line within an acceptable delay. A system is not
available if it is "hard down," that is, being repaired or in
preventive maintenance and unable to be restored to operation in
time to meet the threat.

b. A_o is expressed as the percentage of engagement demand
time during which the system can engage. This quantity is the
availability that is most meaningful to a Fleet Commander and is
what may be expected in an operational environment. A_o takes
into account all maintenance actions as well as delays awaiting
procurement of repair parts. Modifications to A_o may exclude
standby or off time from the calculations.
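
The operational availability calculation can be sketched from a log of hours by system state. The state names, the set of states treated as "available," and the sample log are assumptions made for illustration; an actual project would define its own states in the Test Plan.

```python
# States in which the system is considered available (assumed labels).
# "off_recoverable" means off but able to come on-line within an acceptable delay.
AVAILABLE_STATES = {"operating", "standby", "off_recoverable"}


def operational_availability(hours_by_state: dict) -> float:
    """Fraction of engagement demand time during which the system can engage."""
    demand_time = sum(hours_by_state.values())
    if demand_time <= 0:
        raise ValueError("engagement demand time must be positive")
    available = sum(hours for state, hours in hours_by_state.items()
                    if state in AVAILABLE_STATES)
    return available / demand_time


# Hypothetical 720-hour engagement demand period.
log = {
    "operating": 500.0,
    "standby": 100.0,
    "off_recoverable": 40.0,
    "repair": 30.0,
    "preventive_maintenance": 20.0,
    "awaiting_parts": 30.0,
}

print(f"A_o = {operational_availability(log):.1%}")
```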

507. Logistics Supportability. To satisfy the OT&E requirement
to assess logistics, it is not sufficient to determine that
logistics support will be in accordance with some established
Navy-wide  system.  As  an  OTD,  you  are  in  a  position  to  forestall 
the  totally  unsatisfactory  situation  of  introducing  a  weapon  sys¬ 
tem  to  the  fleet  that  is  not  supportable  at  the  time  of  its 
introduction. 

a.  Systems  designated  for  test  and  evaluation  and  installed 
in  a  fleet  unit  are  usually  supported  by  a  package  of  spares 
assembled  by  the  manufacturer.  This  package  does  not  necessarily 
represent  the  on-board  spares  of  the  ultimate  installation.  Use 
of  the  spares  from  the  package  assembled  by  the  manufacturer 
must  be  carefully  monitored.  Further,  the  type  and  number  of 
spares  actually  used  must  be  compared  to  the  APL  (Allowance  Parts 
List)  if  available. 

b.  Several  qualitative  areas  should  be  examined  when  assessing 
logistics supportability. These include:

(1)  The  availability  and  adequacy  of  special  test  equipment, 
tools,  and/or  handling  equipment. 

(2)  The  need  for  continued  contractor  support. 

(3)  The  requirement  for  any  special  test  or  maintenance 
facilities.

c.  Software  requires  support,  too.  As  with  mechanical  sys¬ 
tems,  problems  (program  errors)  can  be  reported  by  message; 
unlike mechanical systems, repair (a program change) can be
transmitted by message too. If the weapons system involves software,
include software support in your assessment.

d. LFA (logistics factor of availability) is a numerical
measure of the quality of logistics support. LFA is defined as:

    LFA = 1 - \frac{D_s}{U + D_s + D_o}

where D_s is the time spent awaiting delivery of spares
from beyond the unit

D_o is all other down time

U is up time or demand usage time

LFA is the fraction of time that we are not waiting
for external spares. Conversely, 1 - LFA is the
fraction of time that was occupied by logistic delay.
LFA is so defined that if

    A_o = \frac{U}{U + D_s + D_o}

and A_1 is the availability with perfect
logistics, i.e.:

    A_1 = \frac{U}{U + D_o}

then

    A_o = (A_1)(LFA)
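
The product relationship between A_o, A_1, and LFA can be verified numerically. The sketch below assumes LFA = 1 - D_s / (U + D_s + D_o), which is the form required for the product A_1 x LFA to equal A_o; the function names and sample hours are hypothetical.

```python
def lfa(u: float, d_s: float, d_o: float) -> float:
    """Fraction of time NOT spent waiting for spares from beyond the unit."""
    return 1.0 - d_s / (u + d_s + d_o)


def availability_operational(u: float, d_s: float, d_o: float) -> float:
    """A_o: up time over total time, including logistic delay."""
    return u / (u + d_s + d_o)


def availability_perfect_logistics(u: float, d_o: float) -> float:
    """A_1: availability if no time were lost awaiting external spares."""
    return u / (u + d_o)


# Hypothetical hours: 600 up, 40 awaiting external spares, 80 other down time.
U, D_S, D_O = 600.0, 40.0, 80.0

a_o = availability_operational(U, D_S, D_O)
a_1 = availability_perfect_logistics(U, D_O)
factor = lfa(U, D_S, D_O)

print(f"A_o = {a_o:.4f}, A_1 = {a_1:.4f}, LFA = {factor:.4f}")
print(f"A_1 x LFA = {a_1 * factor:.4f}")
```

Because LFA isolates the logistic delay, a low A_o can be attributed either to the equipment itself (low A_1) or to the supply system (low LFA).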

e. The availability, adequacy, and accuracy of all technical
information required for use and maintenance of the system
under test are evaluated under the heading of technical
documentation. Required technical documentation usually includes
operating and maintenance manuals and PMS documentation including
MIPs and MRCs. Current procurement policies do not require final
documentation prior to the production decision. However, some
form of maintenance manual will be needed to support an OPEVAL.
The manuals should be assessed to the extent possible with
respect  to  accuracy  and  usefulness  to  the  operator  and  maintenance 
technician.  Inaccuracies  or  lack  of  required  information  should 
be reported. Preliminary technical manuals should be evaluated
using  the  standards  required  of  the  final  manuals.  If  draft  or 
preliminary  MRCs  are  available,  they  should  be  evaluated  for 
applicability  and  accuracy  as  if  they  were  the  final  issue.  In 
some  new  systems,  the  number  of  operating  modes  and  combinations 
of  modes  may  represent  a  marked  departure  from  those  previously 
available.  The  question  of  how  to  set  the  system  up  for  optimum 
performance  under  different  environmental  conditions  is  an  impor¬ 
tant  one.  When  dealing  with  such  a  system,  the  Developing  Agency 
should  be  tasked  with  providing  an  Operating  Guideline,  which  is 
usually  based  on  technical  information  and  theory.  Inaccuracies 
in  the  technical  information  or  Operating  Guideline  are  evalu¬ 
ated  under  Technical  Documentation;  recommendations  for  better 
methods  to  employ  the  system  fall  logically  in  the  Tactics  cate¬ 
gory. 

508. Compatibility. Compatibility includes the effects of the
environment  on  the  equipment  and  effects  of  the  equipment  on  the 
environment.  When  addressing  compatibility,  the  system  can  be 
evaluated  from  these  points  of  view:  physical,  functional,  and 
electrical/electronic/acoustic.  The  following  outlines  what  to 
look  for. 

a. Physical environment considerations include the effect
of such factors as:

(1) Climate and weather conditions including wind,
temperature and humidity.

(2) Motion in various sea states or heavy air turbulence.

(3) Maneuvers at high speeds and at critical speeds.

(4) High "G" loading or altitude extremes for aircraft
systems.

(5) Depth and pressure extremes for underwater systems.

(6) Sea water temperature, pressure, and salinity.

(7) Icing and saltwater.

(8) Shock and vibration from machinery, gunfire, or ship
speed.

(9) Ecology requirements.

b. Functional environment considerations include the effect
of such factors as:

(1) Size and weight including ancillary units and cabling.

(2) Handling and stowage requirements, including those
for accessories.

(3) Ambient room temperature, ventilation, exhaust, and
air supply requirements.

(4) Internal equipment air or water cooling requirements.

(5) Mechanical and electrical interfaces and interconnections
with other equipments.

(6) Energy requirements such as power, voltage, or fuel.
c. Electrical/electronic/acoustic environment considerations
include the extent and effect of such factors as:

(1) Electrical, radio frequency, and acoustic interference
to and from the systems in the test ship, or weapon.

(2) Main power supply variations including the effect of
power failure and line frequency and voltage instabilities
including voltage extremes and transients.

509. Interoperability

a.  Factors  to  consider  include: 

(1) The ability to transfer information with negligible
distortion.

(2) Proper impedance matching, bandwidth, and data rates.

(3) Fluid flow and mechanical linkages.

(4) Electrical and mechanical loading.

(5) Any degradation of performance of the ship or aircraft
or its other systems that results from the installation or use of
the system being evaluated.

b. Interoperability also concerns operation between the system
being evaluated on "own ship" and other ships, aircraft, or shore
stations with which it must operate. The interoperability point
of view is large and includes such additional considerations as
command and control and mutual interference (such as when radars
or sonars on different ships are operating on the same frequency).
For some projects, interoperability will include the capability
of operating with elements of the other services: the Air Force,
Army, Coast Guard, and the Marine Corps.

c. Interoperability also concerns human engineering and
human factors, i.e., man/machine interface. See your Analyst for
this important area.


Section  6 


Support 

601.  Guidelines .  This  chapter  delineates  the  various  areas  of 
support,  both  internal  and  external  to  OPTEVFOR.  The  OTD  should 
delegate  as  much  effort  as  possible  in  other  expertise  areas  to 
free  him  to  work  in  his  own  areas  of  expertise:  operational  aspects 
and team leadership. This includes delegation to the Analyst in
the areas in which the Analyst has expertise.

602. Missions/Threats/Scenarios/Criteria. As the program develops,
details will become firmer. Parts will sometimes be missing.
As the lack becomes critical to the evaluation, OPTEVFOR may
have to request such guidelines formally. If the gaps delay the
"critical path," the OTD has no recourse except to use his best
judgment in filling the gaps.

603. Instrumentation/Data Processing. OPTEVFOR usually makes a
serious  attempt  not  to  get  involved  with  the  actual  instrumentation 
suite  or  data  processing,  per  se.  Those  areas  are  usually  dele¬ 
gated  to  a  Naval  Laboratory  or  contractor.  OPTEVFOR  should  take 
particular pains to monitor such efforts; however, OPTEVFOR's
responsibility  is  basically  to  determine: 

a. What data we need, including event and axis definitions.

b. How much data per run we need.

c. How accurately we need the data.

d. In what form we need the data.

e. When we need the end-product: the run-by-run data spread
sheets.

Our  experience  in  these  areas  (see  paragraph  302)  indicates  the 
value  of  early  dry  runs  to  "proof"  instrumentation  in  time  for 
corrective  actions. 

604. Simulation/Gaming/Modeling

a. There is a spectrum of simulations, varying from paper
studies to at-sea testing. Figure 6-1 gives this spectrum and
corresponding uses by OPTEVFOR. This is a long-lead area.
OPTEVFOR's responsibility is to identify the extent of need
early and to recommend ways to meet the need.

b.  Besides  the  Program  Manager,  there  are  many  groups 
within  the  Navy  having  specific  knowledge  and  expertise  that 
ought  to  be  tapped  by  OPTEVFOR.  In  particular,  OP-96,  CNA,  and 
ONR  should  be  checkpoints  for  every  system  program  in  which 
OPTEVFOR  becomes  involved.  They  frequently  have  conducted  studies 
directly applicable to the system in question, or can direct us
to the agencies who have. (Conversely, for their studies they
often  have  need  of  system  and  operational  parameters  that  could 
be  furnished  by  us.) 

c.  Even  if  a  simulation  is  available,  contacts  should  be 
made  early.  Modifications  may  have  to  be  made  to  make  it  more 
operational, etc. In the modeling area, OPTEVFOR takes particular
pains to monitor such modeling rather than become technical matter
experts.

d. In the actual use of the simulation, specific tests are
given in the Test Plan. These tests may include:

(1) Sensitivity studies.

(2) Rehearsal of project operations.

(3) Validation of simulation.

(4) Augmentation: project operations.

(5) Real threats, real areas.

(6) Countermeasures.

(7) Measure of mission success.

(8) Tactics improvement.

[Figure 6-1. Spectrum of simulations and corresponding uses by
OPTEVFOR. Figure not reproduced; one recoverable entry reads
"Displays, man-in-loop mock-ups, trainer."]

605. Project Analysis

a.  The  OTD  is  expected  to  use  his  Analyst  before,  during,  and 
after  project  operations.  As  a  vital  team  member,  the  Analyst 
has  many  functions  to  perform.  Some  "cost"  is  involved  in 
becoming familiar with the project, i.e., the Analyst has to be
thoroughly  briefed  before  he  "earns"  his  keep.  Table  6-1 
includes  the  element  of  familiarization  and  gives  the  analytical 
support  functions  that  the  OTD  can  expect  from  his  Analyst. 

b.  The  Analyst  is  completely  responsible  for  very  few  functions 
per  se.  Note  all  the  "...aid  the  ..."  in  Table  6-1.  This  is  to 
denote  his  being  a  team  member.  In  each  function,  however,  the 
Analyst  is  professionally  responsible  for  all  analytical  and 
statistical  techniques.  In  addition,  he,  as  a  team  member,  is 
professionally  responsible  for  all  analytical  steps,  including 
timeliness.

c. The Analyst is in danger of becoming an assistant
OTD. For example, certain sections of plans and reports are best
prepared initially by an Analyst. However, the OTD should ensure
clarity, understanding, etc. To meet deadlines, the Analyst would
be tempted to prepare more and more of the plan and report.
The Analyst should avoid preparation of briefings, becoming a
format or report expert, etc.

d. The OPTEVFOR Analyst is expected to service many projects.
This automatically means outside support on major evaluations:
delegating as much of the routine to others as possible, while
using extreme care in spelling out the specific support needs.
The plan for data analysis must be in more detail than usual,
and most critical is continuous monitoring of the effort at all
stages.

e.  The  OPTEVFOR  Analyst  is  responsible  for  project  work 
done  by  other  analysts  for  OPTEVFOR.  This  includes  early 
determination  that  outside  support  is  needed,  the  amount  and  type 
needed,  the  group  best  fitted,  and  close  monitoring  at  each  and 
every  stage. 

606.  Types  of  Support  Augmentation.  On  large  complex  system 
evaluations  the  question  is  how  better  to  obtain  the  necessary 
support: technical agency or contractors. Frankly, the selection
is not too critical; we have seen excellent work done by both
groups.  Other  things  being  equal,  the  following  guidelines  apply: 

a. Technical Agency. When the support work can best be done
at the source of the inputs, away from the Headquarters/Squadron/
Detachment.

(1) We may have to work closely with the Technical
Agency in the planning stage.

(2) The project operations are continuous rather than
intermittent, affording little opportunity for changing test
objectives.

(3) There is extensive data processing.

(4) Technical analysis and support is necessary.

(5) The data analysis procedures can be adequately
foreseen.

(6) Minor slippages in deadlines can be tolerated.
b. Contractors. When the support work can best be done
at the Headquarters/Squadron/Detachment.

(1) Numerous technical agencies may be involved.

(2) The situation is fluid, leading to changes in
objectives.

(3) Data processing is minimal or done by a third
party.

(4) Technical analysis and support is minimal or too
diverse.

(5) The data analysis procedures cannot be adequately
foreseen.

(6) Direct control is necessary to ensure concentrated,
full-time effort to meet deadlines.

Table  6-1 


Generalized  Functions  of  an  Analyst  in  a  General  Project 

Familiarization 

1.  Become  familiar  with  project,  system,  Navy's  requirements. 

2.  Become  familiar  with  conceptual  design  studies. 

3.  Become  familiar  with  simulations,  test  beds. 

4.  Become  familiar  with  current  status,  calendar  of  events. 

5.  Become  familiar  with  software  management  plan. 

Project  Logic 

6.  Aid  OTD  in  preparing  scenarios. 

7.  Aid  OTD  in  determining  or  interpreting  criteria. 

8.  Aid  OTD  in  determining  CEM  and  MOS. 

9. Aid OTD in the dendritic structure of objectives.

10.  Aid  OTD  in  developing  TEMP. 

Long-Lead  Time  Aspects 

11.  Aid  OTD  in  determining  fleet  services  (preliminary). 

12.  Aid  OTD  in  identifying  new  target  needs,  obtaining 
threat  information,  having  simulations  available. 

13. Aid OTD in determining instrumentation needs, analytical
support  (preliminary). 

14.  Aid  OTD  in  special  funding  requests. 

Early  Involvement 

15. Become familiar with manufacturer's plan, data, results.

16.  Aid  OTD  in  identifying  operational  issues. 

17.  Aid  OTD  in  resolving  operational  issues. 

18. Aid OTD in making early assessments.


OPEVAL Planning

19. Aid OTD in firming-up OPEVAL test objectives.

20. Aid OTD in firming-up OPEVAL test allocations, test
conditions, sample sizes.

21. Aid OTD in firming-up OPEVAL data needs.

22. Aid OTD in firming-up analytical support requirements.

23. Conduct sensitivity studies.

24. Review DA test plan.

25. Aid OTD in merging DA/COMOPTEVFOR needs, allocations.

26. Review expected services.

27. Aid OTD in drafting Test Plan.

28. Conduct rehearsal of plan.

29. Attend cut board on plan, help revise plan.

30. Advise on impact of changes during at-sea operations.

After At-Sea Operations

31. Conduct, analyze, report simulation validation.

32. Review TECHEVAL data processing.

33. Monitor or conduct OPEVAL data processing.

34. Monitor or conduct OPEVAL data analysis.

35. Monitor or conduct simulation tests.

36. Review TECHEVAL report.

37. Monitor or conduct effectiveness and reliability modeling.

38. Prepare analysis results for OTD.

39. Aid OTD in DSARC briefings/message reports.

40. Aid OTD in preparing operating doctrine, tactics.

41. Aid OTD in defending report.


Section  7 


Summary 


701.  Purpose  of  OT&E.  The  primary  purpose  of  OT&E  is  to  estimate 
and  predict  the  prospective  system's  operational  effectiveness 
and operational suitability. This estimation is a projection of
the test results to some future time when the system will be used
in the fleet against real targets. A meaningful projection
requires testing in as realistic a manner as possible in:

a.  Missions  and  scenarios  using  expected  tactics. 

b.  Targets  that  strike  back. 

c. Production-type hardware.

d.  Typical  personnel  in  rate,  training,  and  number. 

e.  Intended  operating  environment. 


702. Realism. To be realistic, we should include as broad a scope
in our evaluation as possible. It is more efficient, more valid,
and more useful to test as many different scenarios or conditions
as possible before repeating any. There are many constraints
to complete realism. For example, in early OT&E, hardware may be
breadboard, tactics information may be incomplete, etc. Test
services including number of firings, targets, etc. are so limited
that free-play must be controlled so encounters are forced.

Another typical constraint is instrumentation. We cannot
afford complete realism because it prevents knowing what went
on, if we had a valid opportunity, etc. If we cannot reconstruct,
then we are playing and not evaluating.

703. Test Data. In theory, the operational MOE in evaluating the
kill potential of a weapon system is the hit/miss count. This is
ideal if sufficient firings are available. However, test services
are always limited. In our work, hit/miss counts and other direct
test data must be amplified by:

a.  Technical,  engineering  examination  of  each  firing. 

b.  Extensive  use  of  supplementary  data  from  non-firing 
tests,  from  functional  component  tests,  etc. 

c.  Use  of  modeling  and  computer  simulations. 

d. Use of expert judgement.

704. Analysis in OT&E

a.  In  OT&E,  the  stress  is  on  evaluation  as  well  as  on  test. 
The lack of complete realism coupled with limited test services
limits the impact of direct testing and increases the need for
evaluation. It is this evaluation process that, hampered by numerous
constraints, necessitates analysis. This is the basis for intense
interest  and  the  need  for  a  high  order  of  analysis.  The  objec¬ 
tives  of  analysis  are  to  insure  that: 

(1)  In  planning,  we  ask  the  correct  questions. 

(2)  In  testing,  we  use  the  minimum  amount  of  services. 

(3)  In  analysis,  we  answer  the  questions  correctly. 

b. Analysis in OT&E draws heavily on operations research
and statistics. Analysis is more an art than a science, though it
uses scientific procedures and techniques. With the stress on
operations, only selected scientific procedures are useful. Even
these must be adapted. In all cases, the requirements must be
expressed or defined in operational terms. Even DOD standard
terminology must be interpreted in operational terms.


c.  An  evaluation  requires  a  thorough,  sometimes  intensive, 
analysis  effort.  The  OTD  has  managerial  responsibility  in  this 
area  —  he  is  not  responsible  for  the  specifics  of  the  analysis. 
The  Analyst  is  responsible  to  provide  the  analytical  approach, 
specifics  of  test  design  and  planning,  professional  techniques 
used,  and  data  analysis.  The  Analyst  may  have  to  direct  others 
in  doing  some  of  this,  such  as  in  data  analysis,  but  he  is  respon¬ 
sible  for  the  output.  The  Analyst  can  only  function  with  inputs 
and  guidelines  supplied  by  the  OTD.  The  best  modus  operandi 
is  to  work  as  a  team,  starting  with  an  initial  discussion  of  the 
mission,  scenarios,  etc.  Then  each  does  what  he  can  do  better 
because  of  his  training,  experience,  etc. 

d. Data analysis first determines test conditions that are
similar in terms of results. These form homogeneous groups.
Data within a group may vary only to a minor extent, while from
group to group the data may be quite different. A cardinal sin
in data analysis is to use an overall average of different
groups, based on the number of data points obtained during project
operations. The proper way to obtain an overall average is
to weight each group average with the relative occurrence expected
in realistic use of the system. If such weights cannot be
obtained, then it is best to report each group result separately.
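
The difference between the pooled average and the properly weighted average can be seen in a short sketch. The group names, the scores, and the expected-occurrence weights below are hypothetical; they were chosen so that one group supplies most of the data points even though both conditions are expected equally often in fleet use.

```python
def pooled_average(groups: dict) -> float:
    """The 'cardinal sin': average every data point regardless of group."""
    points = [x for values in groups.values() for x in values]
    return sum(points) / len(points)


def weighted_average(groups: dict, expected_occurrence: dict) -> float:
    """Weight each group average by its expected relative occurrence."""
    total_weight = sum(expected_occurrence[name] for name in groups)
    weighted_sum = sum(
        expected_occurrence[name] * (sum(values) / len(values))
        for name, values in groups.items()
    )
    return weighted_sum / total_weight


# Hypothetical results: eight runs in calm seas, only two in rough seas.
results = {
    "calm": [0.9, 0.8, 0.9, 0.8, 0.9, 0.9, 0.8, 0.8],
    "rough": [0.4, 0.6],
}

# Hypothetical fleet expectation: both conditions occur equally often.
occurrence = {"calm": 0.5, "rough": 0.5}

print(f"pooled   = {pooled_average(results):.3f}")
print(f"weighted = {weighted_average(results, occurrence):.3f}")
```

The pooled figure is dominated by whichever condition happened to receive the most runs, which reflects the test schedule rather than realistic use of the system.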

e. The skipper on the bridge is only interested in whether
his firing hit or missed the target. If it missed, he doesn't
care that it missed due to software or hardware failure or poor
performance. So why not just count the hits and misses and call
that our analysis? If our sample sizes were large, this would
be sufficient. However, to squeeze as much information as possible
from our small sample sizes requires detailed analysis. Initially,
it usually is more fruitful to determine effects separately.
(Stresses are different, mechanisms of failure vary, etc.) Then,
the separate results are combined based on the scenarios to answer
evaluation criteria or what the skipper on the bridge wants to
know.

705.  MOE  Approach.  The  MOE  approach  is  the  basic  analysis 
approach.  In  OT&E,  this  is  extended  to  include  the  support 
principle.  The  critical  MOE  includes  more  than  the  system  being 
evaluated;  it  includes  that  which  the  system  is  supporting.  For 
example,  in  evaluating  a  new  sonar,  the  sonar  is  viewed  in  terms 
of  the  system  it  is  supporting.  For  a  certain  mission,  the  sonar 
may  support  the  fire  control  system.  The  critical  MOE  would  be 
the  timeliness  and  accuracy  of  torpedo  launch  parameters.  Suppose 
testing indicated poor fire control solutions. Even if the sonar
per se was technically fine and the interface was what led to the
poor solution, the sonar would be deemed not operationally
effective in this mission because a submarine CO would rather have
the old system than the new. Technically the interface may not
be  subject  to  test;  it  may  not  be  the  responsibility  of  the  sonar 
program  manager.  However,  we  are  not  evaluating  NAVMAT  or  the 
program  manager.  This  extension  of  the  MOE  is  an  important 
difference  between  developmental  and  operational  viewpoints. 


706. Operational Effectiveness and Operational Suitability

a.  The  likelihood  of  the  system  being  "up"  is  a  major 
factor  in  operational  suitability.  This  factor  is  a  combination 
of  availability  and  reliability. 

(1) Availability pertains to the weapon system being ready
when needed. (Is the aircraft on deck "up" materiel-wise? Is
it ready to start its mission?)

(2) Reliability pertains to the weapon system completing
its mission without a materiel abort. (Does the aircraft complete
its bombing mission without a materiel abort?) Stress and
repair help differentiate between the two. Reliability is
important with full stress when no repairs are possible during
the stress -- e.g., a missile in flight. Availability is
important when repairs are possible -- e.g., a continuously
operating radar. While the likelihood of completing a deployment
period without a failure (reliability) may be of interest, this
measure is not so important since the radar can be repaired.

b. Operational effectiveness pertains to system performance
when the system is "up," i.e., no mission-aborting failures.

c. While effectiveness and suitability are determined
separately, the evaluation is not complete until both are combined
into an overall measure. This measure, the MOMS, includes:

(1) The probability of a system being up when called on.

(2) The probability of the system remaining up throughout
the mission.

(3) The effectiveness of the up system in performing its
mission.

Thus, operational questions like ". . . was the target killed . . ."
are answered by calculation rather than by an observed count of
hits and misses. With small sample sizes the calculation method
is preferred.
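The calculation can be sketched as follows. The guide lists the three factors but does not state a formula; this sketch assumes that each factor is conditional on the one before it, so that the chain multiplies. The function name and the numerical values are hypothetical.

```python
def measure_of_mission_success(p_up_when_called, p_stays_up, effectiveness):
    """Combine the three factors of paragraph 706c as a product.

    Assumption: each factor is conditional on the one before it, so the
    chain multiplies. The guide itself does not give the formula.
    """
    for p in (p_up_when_called, p_stays_up, effectiveness):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each factor must be a probability in [0, 1]")
    return p_up_when_called * p_stays_up * effectiveness


# Hypothetical values: availability 0.90, reliability 0.80, effectiveness 0.75.
moms = measure_of_mission_success(0.90, 0.80, 0.75)
print(round(moms, 2))
```

Each factor can be estimated from all the runs that bear on it, which is why the calculated result uses a small sample more fully than a bare count of kills.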

d.  Analytically  speaking,  we  should  not  penalize  a  "bad" 
run  twice.  That  is,  if  a  missile  misses  the  target  because 
of  effectiveness  or  a  hardware  failure,  it  should  be  counted  as  a 
miss  in  effectiveness  or  reliability  but  not  both.  Again  speaking 
analytically,  effectiveness  and  suitability  are  similar.  The 
principles,  practices,  techniques,  etc.  that  apply  to  one  apply 
to  the  other. 

707. System-Level Testing. Operational evaluation includes as
much of the entire system as possible. For example, missile
evaluations should include some warhead firings against "real"
targets. Tradeoff analyses usually indicate that very few warhead
firings are worthwhile considering the loss of targets, loss of
exercise data, etc. Regardless, experience has indicated the
value of at least one complete system check.

708.  Mission  Orientation  in  Testing.  The  stress  on  operational 
measures  in  paragraph  705  requires  more  than  event-by-event  or 
one-on-one  testing.  Testing  should  relate  to  at  least  a  whole 
engagement,  and  ofttimes  to  a  deployment  or  a  blue/gold  cruise. 

For example, suppose a new helo was being evaluated in an amphibious
operation. Rather than having the MOMS pertain to having
the helo complete one sortie (ship to shore to ship), the measure
should include completing the required number of sorties to
conduct the amphibious mission with a typical number of helos.

709. Responsibilities in OT&E

a. COMOPTEVFOR is the Navy's only OT&E agency; thus we are
the only ones to plan and report on OT&E. While others may conduct
the project operations according to our test plan, we are
in charge. OT&E is our responsibility.

b. While we are the Navy's experts in how to test operationally,
certain inputs that are critical to operational testing
and  evaluation  should  be  determined  for  us  by  CNO.  These  include 
missions,  scenarios,  preliminary  tactics,  expected  threats,  fleet 
needs,  and  operational  requirements  and  criteria. 

c. Working with these inputs, operational analysis can be
most beneficial in OT-O and OT-I to ensure military usefulness in
the final product. This has two facets:

(1)  Insurance  as  to  military,  operational,  and  combat 
inputs  early  in  system  development. 

(2)  Risk  estimations  and  determination  as  to  the  success 
of  the  development  at  each  phase  beginning  with  OT-I.  Growth 
expectations,  risk  methodology,  etc.  are  included.  The  analysis 
effort  for  the  TEMP  is  critical.  The  analysis  logic  of  test 
phasing,  use  of  long-lead  items  such  as  digital  simulation,  MOMS 
statements, etc., all have far-reaching effects.

d. COMOPTEVFOR is charged with being independent. It is
obvious why we should be free of DA ties. This independence
also pertains to the user, the fleet. We must refrain from being
"in bed" with the fleet. We must be objective in our appraisal.
The user would always want a proposed system that is an improvement.
However, trade-off analysis may indicate that the improvement
is marginal compared to other proposed systems, etc.

710. The Stress in OT&E

a.  The  stress  in  OT&E  is  correctly  focused  on  combat,  on  the 
hot  war  situation.  While  this  is  the  primary  consideration,  the 
following  should  also  be  an  important  part  of  the  operational 
criteria: 

(1) Cold War. If cold war functions are different
from those of a hot war (usually less), this difference should
be taken into account. The criteria (e.g., for detection,
acquisition, and classification) should reflect a double use: cold
as well as hot war.

(2)  Fleet  Firings.  The  evaluation  of  an  advanced  torpedo 
stressed  the  fact  that  there  would  be  weekly  training  firings  when 
the  system  was  introduced  into  the  fleet.  Retrieval,  refurbishment, 
etc.  became  critical  evaluation  aspects. 

b. We should not be content with preparing for the next conflict
by reliving the last. Warfare is basically a history of
change. If we put all our eggs in one basket, we are playing a
dangerous game. The concept of a weapon mix is a useful
consideration in our approach.

c. Reduced manning is the modus operandi in the Navy -- only
in extreme circumstances will a prospective system be deemed
effective and suitable if it requires an increase in ship manning.
The reality of reduced manning must be stressed in our evaluations.




d.  Certain  elements  are  not  pertinent  to  OT&E.  For  example, 
design  or  contract  specifications  are  used  by  the  DA  as  the  basis 
for  contract  compliance  and  for  certification  of  readiness  for 
OPEVAL  —  they  are  not  criteria  in  OT&E.  Cost  analysis  in  a  formal 
sense  is  not  part  of  OT&E  —  equipments  are  cost-examined  during 
concept  formulation  and  contract  definition,  not  during  OT&E. 

e. OT&E has as an end product a report. This report must
be accurate, valid, timely, etc. -- and credible. Not only must
we be right, but we must also convince others that we are right.

(1)  Whatever  we  do  must  be  carefully  documented. 

(2) Measurements must be quantitative if at all possible.
However, quantitative measurements are seldom sufficient -- they
must usually be supplemented by qualitative elements based on
operational experience. Qualitative results are the weakest link
in the evaluation chain. To strengthen this link we use a high
degree of structure: specific questionnaires in a test plan,
debriefings, etc. More important, we broaden the base of our results
as much as possible. "In the opinion of the OTD" has less impact
than "all qualified on-scene observers, including representatives
of the DA and the contractor, agree..."

f.  OPTEVFOR's  involvement  with  acquisition  programs  has 
indicated  that  most  programs  tend  toward  success-orientation,  with 
overly  optimistic  schedules  and  procurement  plans.  Software 
development  is  often  a  major  problem  area;  experience  has  shown 
that  at  least  a  third  of  the  effort  in  software  development  will 
be in testing and correcting system integration deficiencies. If
this effort is not budgeted, the evaluation will be in difficulty.


To  determine  the  degree  of  success-orientation  in  a  program,  the 
program  structure  and  the  following  should  be  examined: 

(1)  Software  Management  Plan. 

(2)  Human  Engineering  Program  Plan. 

(3)  Integrated  Logistics  Support  Plan. 

(4)  Training  Plan. 

Reviews  of  these  plans  in  early  stages  of  a  program  can  be  of 
great  benefit. 


Section  8 


Glossary of Special Analytical Terms


Accuracy. Refers to the deviation of a result from the "true"
value.  In  a  broad  sense,  accuracy  refers  to  the  validity  of  our 
report  in  predicting  fleet  results.  This  is  not  the  same  as 
precision. 

ANOVA (Analysis of Variance). A technique useful in analyzing
multi-variable,  multi-setting  factorials. 

Bayesian. Refers to a formal incorporation of prior information,
to reduce the amount of testing. The validity of this approach is
still questioned by many analysts.

Binomial. The jargon for hit/miss, yes/no count-type data with
two categories.

By-Directional. A method of testing in sets of two runs, the
second of the pair geometrically opposite in direction from the
first of the pair, to balance out effects of wind, current, etc.

CEM (Combat Effectiveness Measure). A measure of effectiveness
specifically pertaining to system performance in completing
its mission assuming the hardware is "up".

CEP (Circular Error Probable). A summary measure of bombing or
tracking accuracy defined as a circle, centered at the aimpoint
or mean point of impact, to include 50% of the data. The median
of radial errors is one of many ways to determine CEP.
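The median-of-radial-errors method named in the CEP entry can be sketched in Python as follows. The function name and the impact coordinates are hypothetical; the circle is centered at the aimpoint here, and passing the mean point of impact as the center is the other option the entry allows.

```python
import math
import statistics


def cep_median(impacts, center=(0.0, 0.0)):
    """Estimate CEP as the median radial error about the chosen center."""
    cx, cy = center
    radial_errors = [math.hypot(x - cx, y - cy) for x, y in impacts]
    return statistics.median(radial_errors)


# Hypothetical impact points (x, y) relative to an aimpoint at the origin.
impacts = [(3.0, 4.0), (0.0, 1.0), (6.0, 8.0), (0.0, 2.0), (5.0, 12.0)]
print(cep_median(impacts))  # radius of the circle holding half the impacts
```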

Chi Square (χ²) Distribution. A particular type of sampling
distribution. Tabular values are useful in analyzing count-type
data.




Conditional Probability. The probability that an event will be
successful, given that the previous event in the functional
process or chain has occurred.

Confidence  Coefficient.  The  chance  that  a  confidence  interval 
has  of  including  the  true  value. 

Confidence  Interval.  An  interval  that  has  a  designated  chance 
(the  confidence  coefficient)  of  including  the  true  value. 

Confidence Limits. The end points of a confidence interval.

Confounding. Refers to results that cannot be attributed
to a particular single variable or cause.

Contingency  Tables.  An  analytical  technique  useful  to  determine 
significant  effects  with  count-type  data. 

Correlation Coefficient. An analytical measure of how well changes
in one variable are matched by changes in another variable.
Usually denoted by the symbol r.

Correlation Index. The square of the correlation coefficient
(r²). The index gives the ratio of the variation explained by
the independent variable to the total variation in the dependent
variable. This is a measure of efficiency of fit.
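As a worked illustration of the correlation coefficient and the correlation index, the Python sketch below computes r and r² for a small data set. The standard product-moment form of r is assumed, since the glossary does not give a formula, and the data are hypothetical.

```python
import math


def correlation_coefficient(xs, ys):
    """Product-moment correlation r between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations (independent and dependent variable).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r = correlation_coefficient(xs, ys)
print(round(r, 2), round(r ** 2, 2))  # coefficient and correlation index
```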

Data.  Refers  to  the  basic  outputs  of  data  processing  that  are 
subject  to  analysis.  Data  may  be  continuous,  quantitative  type 
or  count  (hit/miss)  type. 

Degrees  of  Freedom.  In  data  analysis,  usually  sample  size  less 
one  or  the  number  of  test  settings  less  one. 

DEP (Deflection Error Probable). A measure of bombing or tracking
accuracy in deflection. It is the interval centered at the aimpoint
or mean point that will include 50% of the errors in the
deflection axis.


Dependent  Variable.  The  test  data  or  variable  affected  by  the 
test  or  independent  variables. 

Error of the First Kind. The alpha error (α) of rejecting a good
system. Attributing a difference to results when in reality there
is no difference. Also called "Type I" error.

Error of the Second Kind. The beta error (β) of accepting a poor
system. The chance of missing an important difference in results.
Also called "Type II" error.

F Distribution. A particular type of sampling distribution.
Tabular values are useful in the analysis of variance technique.

Factorial Experiment. A test design (matrix) in which settings of
each test variable are tested with all settings of every other
variable and combinations.
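A full factorial matrix can be enumerated mechanically. The Python sketch below uses the standard-library `itertools.product`; the variable names and settings are hypothetical examples, not taken from any test plan.

```python
from itertools import product


def full_factorial(variables):
    """Return every combination of settings as a list of test conditions."""
    names = list(variables)
    return [
        dict(zip(names, combo))
        for combo in product(*(variables[name] for name in names))
    ]


# Hypothetical test variables and their settings.
variables = {
    "sea_state": ["calm", "rough"],
    "target_speed": ["slow", "fast"],
    "range": ["short", "medium", "long"],
}

conditions = full_factorial(variables)
print(len(conditions))  # 2 x 2 x 3 settings
print(conditions[0])
```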

Fractional. A test design in which only a selected part
of a full factorial matrix is tested, depending on the absence
of certain interactions.

Function/Variable Chart. An analytical process to determine which
variables should be tested by which function in a system evaluation.

Geometric Mean. The mean of a set of data that has been transformed
to logarithms, the mean of the logs found, and the anti-log found
of this result. Useful in analysis of detection ranges, reaction
time, or repair time data.
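The three steps in the Geometric Mean entry (take logs, average the logs, take the anti-log) can be sketched in Python as follows. The function name and the repair times are hypothetical.

```python
import math


def geometric_mean(data):
    """Transform to logs, average the logs, then take the anti-log."""
    if not data or any(x <= 0 for x in data):
        raise ValueError("geometric mean needs one or more positive values")
    logs = [math.log(x) for x in data]
    return math.exp(sum(logs) / len(logs))


# Hypothetical repair times in hours; one long repair skews the arithmetic mean.
repair_times = [1.0, 10.0, 100.0]
print(sum(repair_times) / len(repair_times))  # arithmetic mean
print(geometric_mean(repair_times))           # geometric mean
```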

Histogram. A bar diagram representing a frequency distribution.

Independent Variables. Causal variables, usually controlled
at certain test settings. These are the variables that affect
the dependent variables, the test data.


Interaction.  The  tendency  for  the  test  data  to  be  dependent 
on  the  combination  of  two  or  more  variables  and  to  give  a  result 
different  from  the  sum  of  the  individual  contributions. 

Lateral  Range.  A  summary  measure  of  performance  obtained  by  first 
transcribing  the  area  of  frequency  of  occurrences  into  a  step 
(all  or  none)  function.  The  value  at  the  step  is  the  lateral 
range.  Sweep  width  is  twice  this  range. 

Latin  Square.  A  test  design  useful  when  three  test  variables 
do  not  interact. 

Least Squares Method. A method of fitting a line that minimizes
the sum of squares of deviations from the fitted result. Used in
regression.
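A minimal Python sketch of a least squares straight-line fit follows. It uses the usual closed-form solution for a single independent variable; the function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```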


Mean. A popular measure of central tendency, the centroid
of a frequency distribution. A parameter in the normal distribution.
Also called the arithmetic mean or average.

Mean  Square.  Measures  the  quantitative  importance  of  each  effect 
in  analysis  of  variance.  Also  the  variance  or  square  of  the 
standard  deviation. 


Measurements. Refers to the large quantity of positional, etc.,
measurements taken that are combined with other sets by data
processing into test data.

Median.  A  measure  of  central  tendency  defined  as  the  value  of  the 
middle  datum  when  the  data  are  monotonically  arranged. 

MOE  (Measure  of  Effectiveness).  A  numerical  measure  of  how  well 
a  task  is  done  or  an  objective  is  met,  a  generic  term. 

MOMS (Measure of Mission Success). A measure that combines measures
of performance and suitability.

[...] pertaining to system availability, reliability, and maintainability
in completing its mission.

Non-Parametric. An analytical approach that does not assume a
particular distribution as a basis for the analysis techniques.

Normal. A type of probability distribution that is usually due to
a multitude of variables, each small in effect. Can be logarithmic
normal when effects are relative. Also called Gaussian.

Normal Probability Paper. A type of graph paper scaled so that
a cumulative normal distribution will plot as a straight line.

Null Hypothesis. A tentative hypothesis that there is no difference
among conditions, which is then tested analytically for
significance.

One-at-a-Time Approach. A test design that first studies one
variable completely, then another, etc., in sequence. Interest
centers on a few particular conditions that cannot be integrated.

One-Tail Test. A test of significance when the alternate to the
null hypothesis includes only one critical value, either less
than or more than, but not both as in the two-tail test.

Operating Characteristic Curve. The curve that gives the probability
of acceptance as a function of sample size and the true
value or true answer.

Performance Measure. A summary measure based on combining all
the individual data available for the same type, i.e., dependent
variable.


Population. An analysis concept representing the totality or
universe of test conditions, pieces of equipment, fleet operators,
etc., that are sampled during project operations.

Precision. A measure of the variation of the data among themselves,
usually as determined by the standard deviation among
the data from the mean value. The smaller the standard deviation,
the better the precision.

Probability  Distribution.  A  distribution  of  relative  frequencies 
(based  on  large  samples). 

Radial Error. Refers to miss distance when the data are in terms
of absolute distance from the aimpoint without regard to bearing.

Randomness. An intuitive concept referring to an approach that
leads to disorder and unpredictability of individual data. Useful
in determining the sequence of testing to minimize such
effects as environment changes, practices, equipment, wearout.

Regression. An analysis technique to fit a line or curve or
hyperplane to a set of data. The technique includes determining
the coefficients and correlation, testing for significance, etc.

REP (Range Error Probable). A measure of bombing or tracking
accuracy in the range direction. It is the interval centered at
the aimpoint or mean point that will include 50% of the errors
in the range axis.

Repeatability. A measure of precision to include variations
because of short-term effects and measurement errors.

Replicatibility. A measure of precision to include as much
variation in time, area, etc., as our testing permits.


Replication.  A  complete  test  of  all  test  conditions  before 
retesting  any  conditions. 

Reproducibility.  A  measure  of  precision  to  include  variations 
from  ship  to  ship,  target  to  target,  etc.,  usually  important  but 
unattainable  in  our  evaluations. 


Sample.  The  number  of  objects,  operators,  and  conditions  that  were 
actually  observed  to  represent  the  totality  of  the  population  of 
objects,  operators,  and  conditions. 

Sequential Approach. A test design used in conjunction with a
chart that delineates possible stoppage of testing after each run.

Sets Approach. Refers to testing all comparisons of interest once
in a group before proceeding to a repeat of the groups. Also
called side-by-side or back-to-back when only two comparisons are
involved.


Side-by-Side.  Refers  to  testing  two  items  during  the  same  time 
frame  to  minimize  external  effects  on  the  comparison.  Also  called 
pairing  or  back-to-back. 

Significance Level. A selected chance value (α) or statistical
threshold used with tests of significance to decide whether or
not two means are similar or significantly different.

Standard Deviation (σ). A parameter in a normal distribution
measuring precision.

Stepwise Regression. A computerized curve-fitting program that
selects only the significant variables from the input model.

t-Distribution. A probability distribution useful in tests of
significance between two means.


Test  Condition.  The  controlled  test  trial  with  each  test  variable 
held  at  a  particular  setting. 

Test  Design.  An  analytical  combination  of  test  variables  and 
test  settings  to  form  a  group  of  test  conditions  or  experimental 
design. Also includes the sequence of testing and sample size
determinations.

Tests of Significance. An analysis technique to determine whether
the observed difference in means can reasonably (as set by the
significance level) be attributed to chance variation in the data.

Two-Tail Test. A test of significance when the alternate to the
null hypothesis includes two critical values (less than or greater
than). This is contrasted with the one-tail test.

Truncated Data. Refers to data of a continuous type that has a
minimum or maximum because of turnaway, etc. Also called
censored data.

Variance. The value of the square of the standard deviation or
mean square.


